Using FlexiTerm

Software requirements to run FlexiTerm:

How to run FlexiTerm:

Place plain text files into the "text" folder.
OPTIONAL: Replace the file stoplist.txt in the "resources" folder with your own, if needed.
Run FlexiTerm.bat (Windows) or FlexiTerm.sh (Unix/Linux) from the "script" folder on the command line.
Check the results in the "out" folder. They are written in several formats: txt, csv and html.
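
The steps above can also be scripted. The following Python sketch is only an illustration and is not part of FlexiTerm. It assumes that the launch script is started with "sh" (or "cmd /c" on Windows) with the "script" folder as the working directory. The function names stage_inputs, launch_command and run_flexiterm are hypothetical.

```python
import platform
import shutil
import subprocess
from pathlib import Path


def stage_inputs(files, root):
    """Copy plain text files into the "text" folder under the FlexiTerm root."""
    text_dir = Path(root) / "text"
    text_dir.mkdir(parents=True, exist_ok=True)
    return [Path(shutil.copy(f, text_dir)) for f in files]


def launch_command(system=None):
    """Pick the launch script for the platform (assumed invocation, see above)."""
    system = system or platform.system()
    if system == "Windows":
        return ["cmd", "/c", "FlexiTerm.bat"]
    return ["sh", "FlexiTerm.sh"]


def run_flexiterm(root):
    """Run the launch script from the "script" folder and list the "out" folder."""
    subprocess.run(launch_command(), cwd=Path(root) / "script", check=True)
    return sorted((Path(root) / "out").iterdir())
```

Calling stage_inputs with a list of .txt files and then run_flexiterm with the same root folder mirrors the manual steps; the files returned by run_flexiterm are whatever FlexiTerm wrote to "out".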

Folder structure:

bin: Binary (Java .class) files
config: Contains "settings.txt" file with configuration options for FlexiTerm
lib: External libraries required by FlexiTerm
out: Output files
resources: Contains text resources required by FlexiTerm, including:
- resources/dict: WordNet files used by WordNet.java
- resources/models: Models used by Stanford CoreNLP
- resources/stoplist.txt: Stoplist used to filter out stopwords
- resources/dictionary.txt: A list of distinct tokens used as a dictionary by Jazzy to suggest similar tokens
script: Windows/Unix scripts to run FlexiTerm
src: Source (.java) files
text: Input text files.

Output files format:

output.csv : A table of results: Rank | Term | Score | Frequency
output.html : A table of results: ID | Term variants | Score | Rank
output.txt : A list of recognised term variants ordered by their scores.
output.mixup : A Mixup file used by MinorThird to annotate term occurrences in text.
text.html : Input text annotated with occurrences of terms listed in output.html.
log.txt : Log output used for debugging.
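
For further processing, output.csv can be loaded with a few lines of Python. This is a minimal sketch, not part of FlexiTerm. It assumes a comma-separated file with the four columns in the order listed above, an integer rank and frequency, and a numeric score. Rows that do not start with an integer rank (for example a header row, if there is one) are skipped. The function name parse_results is hypothetical.

```python
import csv
import io

COLUMNS = ("rank", "term", "score", "frequency")


def parse_results(csv_text, delimiter=","):
    """Parse the text of output.csv into a list of dictionaries."""
    rows = []
    for cells in csv.reader(io.StringIO(csv_text), delimiter=delimiter):
        if len(cells) != len(COLUMNS):
            continue  # skip blank or malformed lines
        rank, term, score, frequency = (cell.strip() for cell in cells)
        if not rank.isdigit():
            continue  # skip a header row, if present
        rows.append({
            "rank": int(rank),
            "term": term,
            "score": float(score),
            "frequency": int(frequency),
        })
    return rows
```

To use it on a real run, read the file first, e.g. parse_results(open("out/output.csv", encoding="utf-8").read()). The encoding is also an assumption.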

Format of configuration file settings.txt:

Default parameters in settings.txt:

pattern = "(((((NN|JJ) )*NN) IN (((NN|JJ) )*NN))|((NN|JJ )*NN POS (NN|JJ )*NN))|(((NN|JJ) )+NN)" : Regular expression over part-of-speech tags (NN, JJ, IN, POS) used to select term candidates.
max = 3 : Jazzy distance threshold: the maximum number of edit operations allowed between two tokens for them to be treated as similar. Reduce to require closer matches.
min = 2 : Term frequency threshold: a term is kept only if occurrence > min. Increase for better precision.
MIN = 9 : Implicit acronym frequency threshold: an implicit acronym is kept only if occurrence > MIN. Increase for better precision.
acronyms = explicit : Acronyms have to be explicitly defined in text using parentheses.
profiling = 0 : Profiling is disabled.
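
To see which tag sequences the default pattern accepts, it can be tried out in Python, whose regular expression syntax treats this pattern the same way as Java's. The sketch below is an illustration only. It assumes that part-of-speech tags are reduced to the coarse labels used in the pattern, joined by single spaces, and that a candidate must match the pattern as a whole. How FlexiTerm itself applies the pattern may differ. The function name is_candidate is hypothetical.

```python
import re

# Default pattern copied from settings.txt (without the surrounding quotes).
PATTERN = re.compile(
    r"(((((NN|JJ) )*NN) IN (((NN|JJ) )*NN))"
    r"|((NN|JJ )*NN POS (NN|JJ )*NN))"
    r"|(((NN|JJ) )+NN)"
)


def is_candidate(tags):
    """Return True if the whole tag sequence matches the default pattern."""
    return PATTERN.fullmatch(" ".join(tags)) is not None
```

Under these assumptions, is_candidate(["JJ", "NN"]), is_candidate(["NN", "IN", "NN"]) and is_candidate(["NN", "POS", "NN"]) are accepted, while a single ["NN"] is rejected because every alternative needs at least two tags.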
