Deepwalk

This crate provides a implementation of Deepwalk [Perozzi, Al-Rfou, Skiena 2014].

To the run you need to provide a whitespace separated edgelist of a graph where the node ids are integers. See karate_netwrok.txt for an example.

How To

The binary takes one arguments, a json file that contains paramter information.

deepwalk config.json

In config.json you must provide

learning_rate: float - The starting point for the learning rate. This is decremented linearly with each iteration.
vector_dim: usize - The dimensionality desired for the final output vectors.
walk_length: usize - The length of the random walk taken for each node
window_size: usize - The window size applied to the walk. Note nodes +-window_size are used (e.g. a window size of 2 give the two nodes before and 2 nodes after the target node as part of the window)
num_iterations: usize - The number of iterations to run the algorithm for. Note we have not implemented early stopping at this time.
input_file: string - The edge list file described above.
perf_file: string - A file location to write the performance information to (iteration learning_rate error) for each iteration
weight_file: string - A file location to write the final weights/vectors
nthreads: usize - The number of threads to use for running the algorithm. If 0 or 1 is selected this will run single threaded

Karate Example

We have included an example run of the algorithm on the karate network included in this directory.

The configurations for this run can be found in karate_config.json. We have plotted the output weights along with the canonical communities.

We observed similar results to what Perozzi, et.al saw in their runs on the Karate network. Communities were moved into similar posotions in the 2d space of the weight matrix.

Note on Error

Perozzi, et. al. suggested a configuration of walk_length = 40 and window_size = 10 as a good initial values. We have use this for our karate example. Interesting to note is that while the error does not converge and actually remains flat during training we still observe the separation of nodes into their canonical communities.

During other experiments where we have used deepwalk as an input into classification tasks on much larger graphs (deezer) we have seen improvement of the classifiers over random despite the error on the deepwalk not converging.

For this reason we suggest any evaluation of deepwalk parameters be done within the context of the final classificaiton or other use case.

Name		Name	Last commit message	Last commit date
Latest commit History 20 Commits
src		src
.gitignore		.gitignore
Cargo.lock		Cargo.lock
Cargo.toml		Cargo.toml
README.md		README.md
karate_2d.png		karate_2d.png
karate_config.json		karate_config.json
karate_example.R		karate_example.R
karate_network.txt		karate_network.txt
lastfm_asia_edges.txt		lastfm_asia_edges.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Deepwalk

How To

Karate Example

Note on Error

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Deepwalk

How To

Karate Example

Note on Error

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages