news2meme

This is the repository with the code for the paper "News2meme: An Automatic Content Generator from News Based on Word Subspaces from Text and Image".

Small notes:

The work presented on the paper was obtained with a Matlab code which I cannot run because I do not have Matlab's license anymore. Therefore, I re-wrote everything in Python and I hope this change makes the project more accessible (even though it has been over three years since I published this paper ^^")
In the original code, I used pre-trained word2vec embeddings. However, in this repo, I opted for using pre-trained GLoVe just because it was a bit easier for me to work. I expect the results to be pretty similar for both.

Installation

Clone this repo:

git clone https://github.com/ericashimomoto/news2meme.git
cd news2meme

Create a conda environment based on our dependecies and activate it:

conda env create -n <name> --file environment.yaml
conda activate <name>

Where you can replace <name> with whatever name you want.

Only got to test this code on Mac. It should run with not much trouble on Linux. I have no idea if it will run properly on Windows (sorry!).

Download everything:

This script will download the following things in the folder ~/data/news2meme:
- The glove.42B.300d.txt pre-trained word embeddings.
- The news2meme dataset, with the following structure and content:
```
  |-- news2meme
      |-- catchphrases
          |-- catchphrases.csv
      |-- memeimages
          |-- full: Folder with the original images.
          |-- memeImages.csv
      |-- news
          |-- full: Folder with the images available with each news article.
          |-- newsinlevels_level2.csv
```
For News2meme, you don't actually need the images from the articles, but I included in the package as it might be useful for someone playing with the small dataset :)

Configuration

If you changed the download folder, make sure to change the path in settings.py

Train Mode

To be able to generate the memes, you first need to compute the subspaces for each meme image and each catchphrase in the dataset. To do this, run the following:

`python news2meme.py --config=experiments/train.yaml`

This code will save all the subspaces into the OUTPUT_PATH defined in settings.py. I set the contribution ratio to define the subspace dimensions to 0.5, but feel free to change this value in the yaml file to see how it changes the results.

Generation Mode

You can generate memes in three different ways:

Free mode: Put your news article in a .txt file and modify the experiments/gen_single.yaml file with the path of the file in the TEXT field. Then, run the following:

python news2meme.py --config=experiments/gen_single.yaml
Specific mode: Define the news article ID in the experiments/gen_id.yaml file in the NEWS_ID field. Then, run the following:

python news2meme.py --config=experiments/gen_id.yaml
Random mode: Generate a meme for a randomly sampled news article. Run the following: python news2meme.py --config=experiments/gen_random.yaml
Full mode: Generate a meme for each news article in the database. Run the following: python news2meme.py --config=experiments/gen_all.yaml

The news2meme will place a pickle file with the retrieved meme image and catchphrase id in the path specified by OUTPUT_PATH in settings.py. Again, I set the contribution ratio to be 0.8, but feel free to change this value in the respective yaml files to see how it changes the results.

Visualization

To visualize the generated memes, run the following:

python visualize.py --result=output/gen_memes_ids.pickle

Where output/gen_memes_ids.pickle specifies the results generated by news2meme. The generated memes will be saved in the folder specified in field VISUALIZATION_PATH in settings.py.

Citation

If you use our code or dataset, please consider citing our work:

@inproceedings{shimomoto2019news2meme,
  title={News2meme: An automatic content generator from news based on word subspaces from text and image},
  author={Shimomoto, Erica K and Souza, Lincon S and Gatto, Bernardo B and Fukui, Kazuhiro},
  booktitle={2019 16th International Conference on Machine Vision Applications (MVA)},
  pages={1--6},
  year={2019},
  organization={IEEE}
}

Questions?

If you have any questions or problems while running this code, feel free to open an issue or to contact me at kidoshimomoto.e@aist.go.jp

Name		Name	Last commit message	Last commit date
Latest commit History 14 Commits
__pycache__		__pycache__
experiments		experiments
sample		sample
.gitignore		.gitignore
README.md		README.md
data.py		data.py
download.sh		download.sh
environment.yaml		environment.yaml
news2meme.py		news2meme.py
settings.py		settings.py
subspace.py		subspace.py
visualize.py		visualize.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

news2meme

Installation

Configuration

Train Mode

Generation Mode

Visualization

Citation

Questions?

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

news2meme

Installation

Configuration

Train Mode

Generation Mode

Visualization

Citation

Questions?

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages