Alex: list of resource files
Intro
This is an annex of the application part of the Accueil project.
Location
Resource files are located in resourceDir: ~/server/Alex/dico.resources
All files have the extension .pk
Useful commands
>> alr (go to resource dir)
>> alex (view all available alex commands)
>> alex -pke <filename> (edit read-only file)
>> alex -pkv <filename> (view (cat) the whole file)
>> alex -pks <filename(s)> (view file summary: first and last lines)
>> alex -pkt <filename(s)> (create readable unpacked copies in tmp.unpack subdir)
Building pk files
Most or all pk files are built by the resourceBuilder process.
The following command runs the resourceBuilder process (takes some time):
>> alex -r
- wn.id (wordnet keySet, like D285687)
(note that keys in the Alex keySet are generated, NOT read from files)
- id.en
- id.fr
- id.def.en (definitions)
- id.def.fr
- id.vari.en (variant words)
- id.vari.fr
- id.conceptPercept.en (mapping of concept to percept-words)
- id.conceptPercept.fr
- id.perceptVariant.en.ii (mapping of percept-words to variant-words)
- id.perceptVariant.fr.ii
- id.manifest.en
(manifest, i.e. non-ambiguous readable expression of a concept)
manifests are built in step 7 of resourceBuilder
- id.manifest.fr
Q-Link Files
Both files list qualified links rebuilt from wordnet. The first one (id.link.ii) is in int[][] format; the second (id.link) is a set of readable lines.
Line order is irrelevant.
Field 1 : concept index.
Field 2 : concept index
Field 3 : connecting concept index
- id.link.ii
- id.link
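The three-field layout described above can be sketched in Python as follows. This is only an illustration: the field separator (whitespace here), the decimal integer encoding, and the names QLink, parse_qlink and to_int_table are assumptions, not taken from the Alex sources, and the real on-disk layout of id.link.ii is not documented here.

```python
from typing import Iterable, List, NamedTuple


class QLink(NamedTuple):
    """One qualified link (hypothetical name, for illustration only)."""
    concept_a: int   # field 1: concept index
    concept_b: int   # field 2: concept index
    connector: int   # field 3: connecting concept index


def parse_qlink(line: str) -> QLink:
    """Parse one readable line, ASSUMING three whitespace-separated integers."""
    fields = line.split()
    if len(fields) != 3:
        raise ValueError(f"expected 3 fields, got {len(fields)}: {line!r}")
    a, b, c = (int(f) for f in fields)
    return QLink(a, b, c)


def to_int_table(lines: Iterable[str]) -> List[List[int]]:
    """Build an int[][]-like table (one row per link), skipping blank lines.

    Line order is irrelevant in the file, so rows are kept in input order.
    """
    return [list(parse_qlink(line)) for line in lines if line.strip()]
```

Since line order is irrelevant, two files with the same lines in a different order describe the same set of links; comparing them as sets of rows rather than as sequences would reflect that.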
WordNet files
All files are used and created by the resourceBuilder process.
They are derived from wordnet-downloaded files (in subdirectory WordNet-3.0)
- wn.concept
- wn.def.en
- wn.def.fr
- wn.id (keySet file)
line order : co-indexed with the concept index (the line position gives the concept index)
field 1 : wordnet key
- wn.link
line order : irrelevant
field 1 : wordnet key (concept 1)
field 2 : wordnet key (concept 2)
field 3 : concept index of connecting concept
- wn.nvadNum (also used in concept class)
- wn.word.en
- wn.word.fr
- wn.wordlist
List of acceptable wordnet words (words with spaces, words starting with digits, ... are refused).
Line order is irrelevant. Each line contains a word prefixed by one of the letters VNAD.
- id.wnkey
mapping of concept index to wordnet keys
line order : irrelevant
field 1 : concept index
field 2 : wordnet key
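As an illustration of the wn.wordlist rules above, here is a minimal Python sketch. It covers only the two refusal rules named in this page (spaces, leading digit); the real filter applies further rules that are not listed here. The function names and the exact placement of the VNAD prefix (first character of the line, optionally followed by whitespace) are assumptions.

```python
from typing import Tuple

POS_TAGS = "VNAD"  # the prefix letters named in this page


def is_acceptable(word: str) -> bool:
    """Apply the two documented refusal rules (the real list is longer)."""
    if not word:
        return False
    if " " in word:          # words with spaces are refused
        return False
    if word[0].isdigit():    # words starting with a digit are refused
        return False
    return True


def parse_wordlist_line(line: str) -> Tuple[str, str]:
    """Split a line into (tag, word), ASSUMING the tag is the first character."""
    line = line.strip()
    if len(line) < 2 or line[0] not in POS_TAGS:
        raise ValueError(f"not a wordlist line: {line!r}")
    return line[0], line[1:].strip()
```

For example, is_acceptable("hot dog") and is_acceptable("3d") are both False, while is_acceptable("dog") is True.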
XDXF File
XDXF processing is step 5 in resourceBuilder.
The input is the files in the xdxf subdirectory; the output is xdxf.pk.
- xdxf
DELA files
They are derived from DELA-downloaded files (in subdirectory DELA)
DELA processing is steps 2 and 6 in resourceBuilder
- dela.variants.en
- dela.variants.fr
- dela.word.en
- dela.word.fr
Fuzzy links file
Step 8 of the resourceBuilder generates the fuzzy link file:
- fuzzy.ii
Running the resourceBuilder (output example)
Starting resource building steps 012345678 (october 2011)
------------------------------------
DataBuild step 0 (resb)
Create <wn.wordlist> file (list of qWords known in wordnet), with relevant indexing
------------------------------------
DataBuild step 1 (resb)
Creation of english langSet <wn.word.en> <wn.def.en>
Creation og <wn.link> and <wn.concept>
------------------------------------
DataBuild step 2 (dela)
Building qWords and qVariants for both languages: <dela.variants.xx> and <dela.words.xx>
------------------------------------
DataBuild step 3 (resb)
Builds <wn.id> <id.wnkey>
Builds <id.link> <id.link.ii>
------------------------------------
DataBuild step 4 (resb)
Translate definitions from english to french : <wn.def.fr>
Uses google translation api; works incrementally.
------------------------------------
DataBuild step 5 (xdxf)
Input is xdxf files in cfg.resourceDir/xdxf/.
Output is xdxf.pk and various id.*.<lang>.pk files
------------------------------------
DataBuild step 6 (dela)
Building qWords and qVariants for both languages: <id.vari.en> and <id.perceptVariant.xx>
------------------------------------
DataBuild step 7 (manifestBuilder)
Creates manifest file in both languages: <id.manifest.xx>
------------------------------------
DataBuild step 8 (fuzzyBuilder)
Creates from any interesting source fuzzy links in <fuzzy.ii>
Uses now wikipedia links.
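The output above can be condensed into a small lookup table answering "which step builds this file?". The Python sketch below is only a reading aid built from the log text: the table STEP_OUTPUTS and the helper steps_for are hypothetical, "xx" in the log is treated as a wildcard, and step 5 lists only xdxf because the log does not name its "various id.*.<lang>.pk" outputs.

```python
from fnmatch import fnmatchcase
from typing import Dict, List

# Step number -> file-name patterns (without the .pk extension), taken from the log.
STEP_OUTPUTS: Dict[int, List[str]] = {
    0: ["wn.wordlist"],
    1: ["wn.word.en", "wn.def.en", "wn.link", "wn.concept"],
    # The log says dela.words.xx, the file list says dela.word.xx: accept both.
    2: ["dela.variants.*", "dela.word*.*"],
    3: ["wn.id", "id.wnkey", "id.link", "id.link.ii"],
    4: ["wn.def.fr"],
    5: ["xdxf"],  # plus unnamed id.*.<lang> files, not listed in the log
    6: ["id.vari.en", "id.perceptVariant.*"],  # the log names only id.vari.en
    7: ["id.manifest.*"],
    8: ["fuzzy.ii"],
}


def steps_for(filename: str) -> List[int]:
    """Return the steps whose logged outputs match the given file name."""
    return [
        step
        for step, patterns in STEP_OUTPUTS.items()
        if any(fnmatchcase(filename, pattern) for pattern in patterns)
    ]
```

For instance, steps_for("id.manifest.fr") returns [7] and steps_for("wn.def.fr") returns [4]; an empty list means the log does not say which step builds the file.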
Hunspell Notes (doubts about usage)
hunspell formatting !
guide for command(1) and format(4): man hunspell
Dic location: /usr/share/myspell/dicts
Web site : http://wiki.services.openoffice.org/wiki/Dictionaries
? see /usr/share/dict/...
? see /usr/share/stardict/...
? sudo apt-get install stardict