K/V caching implementation in StreamAtt and AlignAtt (#15)

## ❓ Questions and Help

### Before asking:   
1. search the issues.   
2. search the docs.    



#### What is your question?

If we use a seq2seq model with StreamAtt or AlignAtt in streaming mode (audio arriving in chunks), can we use K/V caching?

If yes, do the cached values remain valid when a new audio chunk is concatenated to the previously accumulated audio?

If not, do you have any suggestions for improving inference speed?
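
To make the validity concern concrete, here is a toy sketch in plain Python. It is not fairseq code and not how StreamAtt or AlignAtt are implemented: `attention` and `prefix_unchanged` are hypothetical helpers, and the "encoder" is a single self-attention layer with identity projections. The assumption behind it is that cross-attention keys and values are projections of the encoder states, so a cache is reusable only if the encoder states of already-seen frames stay the same after a new chunk is appended.

```python
import math

def attention(states, causal):
    """Single-head self-attention with identity projections (toy encoder layer)."""
    out = []
    for i, q in enumerate(states):
        # A causal layer lets frame i see frames 0..i; a bidirectional one sees all frames.
        visible = states[: i + 1] if causal else states
        scores = [sum(a * b for a, b in zip(q, k)) for k in visible]
        top = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append(
            [sum(w * v[d] for w, v in zip(weights, visible)) / total
             for d in range(len(q))]
        )
    return out

def prefix_unchanged(old, new, tol=1e-9):
    """True if every vector in `old` matches the same-position vector in `new`."""
    return all(abs(a - b) <= tol for o, n in zip(old, new) for a, b in zip(o, n))

# Hypothetical 2-dimensional "audio frames": two in the first chunk, one in the second.
chunk1 = [[1.0, 0.0], [0.5, 0.5]]
chunk2 = [[0.0, 2.0]]
accumulated = chunk1 + chunk2

# Compare the states of the first chunk before and after the new chunk arrives.
causal_ok = prefix_unchanged(attention(chunk1, True), attention(accumulated, True))
bidir_ok = prefix_unchanged(attention(chunk1, False), attention(accumulated, False))

print("causal encoder: old states unchanged ->", causal_ok)
print("bidirectional encoder: old states unchanged ->", bidir_ok)
```

In this toy setting the earlier frames keep their states only when the layer is causal. With bidirectional attention over the accumulated audio they change, which is why I am unsure whether cached values can be reused.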





#### Code

   

#### What have you tried?

#### What's your environment?

 - fairseq Version (e.g., 1.0 or master):
 - PyTorch Version (e.g., 1.0)
 - OS (e.g., Linux):
 - How you installed fairseq (`pip`, source):
 - Build command you used (if compiling from source):
 - Python version:
 - CUDA/cuDNN version:
 - GPU models and configuration:
 - Any other relevant information:
