-
Notifications
You must be signed in to change notification settings - Fork 1
emu support for prefetch #9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: electron-dev
Are you sure you want to change the base?
Conversation
medav
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall I'd like to understand the intent of this PR: it seems to me like we are conflating functional correctness, with performance modeling.
In practice, the architecture is totally okay if a prefetched block is evicted at some point. Meaning, we can have a scenario where some data gets loaded by the SIMD core that misses in the L1d, which would trigger an assert in this code.
We definitely need a cache model when we start doing the L1 performance model, so I'm willing to put this code in once you address the comments, but I want to clarify that this is not needed to provide functional correctness for software developers writing operator code, so this is not really needed to call the emulator "functionally correct"
| struct Cache { | ||
|
|
||
| private: | ||
| struct Data * cache; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to store data at all? We can just let the emulator access memory through direct access and just track the cache tag array for detecting hit vs miss
|
|
||
| private: | ||
| struct Data * cache; | ||
| uint64_t * tag; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use std::vector
| } | ||
|
|
||
| ~Cache() { | ||
| free(cache); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Forgot to free tag here, but don't need that if you use std::vector
| Cache(const Cache&) = delete; | ||
| Cache& operator=(const Cache&) = delete; | ||
|
|
||
| static Cache& singleton() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This shouldn't be a singleton.
| } | ||
| } | ||
|
|
||
| uint8_t * lookup_cache(uint8_t * addr) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this function should just be bool lookup(void * addr)
Return true if addr is present in the cache, false if not
|
|
||
| public: | ||
| Cache(const Cache&) = delete; | ||
| Cache& operator=(const Cache&) = delete; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we aren't making this a singleton, probably okay to not delete these
| free(cache); | ||
| } | ||
|
|
||
| void fetch_block(uint8_t * addr) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be void fill(void * addr)
In addition, you can skip checking for a hit and just do assert(!lookup(addr));
| inline void vld(uint8_t * addr, VREG_T& dst) { | ||
| hooks->on_mem_read(tile_id, (uintptr_t)addr); | ||
| memcpy(dst.bytes, addr, 64); | ||
| static inline void vld(uint8_t * addr, VREG_T& dst) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
must not be static
| memcpy(dst.bytes, addr, 64); | ||
| static inline void vld(uint8_t * addr, VREG_T& dst) { | ||
| assert(64 % BLOCK_SIZE == 0); | ||
| for (size_t i = 0; i < 64; i += BLOCK_SIZE) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are eventually just reading from *addr anyway... from above comment, just remove the data array entirely from the cache struct and use it just to track tags / presence of data.
|
|
||
| if(i > 0) { | ||
| Cache::fetch(addr); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So since the cache hierarchy is functionally transparent to the user (aside from PREFETCH1 instruction), this function can be a no-op.
I see what you're trying to do though so I'll suggest the following:
Have a hook here called on_prefetch (see #5). Some other location should be responsible for adding a listener on this hook to update a tile's private cache state (tags).
Please check if this PR looks ok.