Create your own dictionary : Stardict to Mdict

dam

Senior Member
Nov 21, 2006
65
3
Tokyo
I wanted to create a new dictionary (Japanese four-kanji expressions) from the very good StarDict database (http://stardict.sourceforge.net/Dictionaries.php).
I used MDict to put it on the Pocket PC, as it allows you to create your own dictionaries. Here I will expand and update the tutorial that is already available inside the MDict converter.


So here we go, and it is all for free !

needed software : mdx builder
from : http://www.octopus-studio.com/download.en.htm (you will see that there are already some dictionaries that have been made)


Steps to convert StarDict dictionary files into mdx format:

1) download the dictionary files in tarball format from http://stardict.sourceforge.net/Dictionaries.php
here we use 4JWORDS (4-kanji idiomatic expressions and proverbs) as an example, from
http://stardict.sourceforge.net/Dictionaries_ja.php

2) extract the file into a temporary directory for example: c:\temp
There should be 3 files in c:\temp now:
4jword3.dict.dz
4jword3.idx
4jword3.ifo
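
For anyone who would rather script the extraction than use an archive tool, here is a minimal Python sketch. It assumes the download is a .tar.bz2 archive (like stardict-4jword3-2.4.2.tar.bz2) and that the three files may sit inside a sub-folder of the archive, so it copies them flat into the target directory. The function name extract_stardict is just an illustration, not part of any StarDict or MDict tool.

```python
import os
import tarfile


def extract_stardict(archive_path, dest_dir):
    """Copy every regular file from a .tar.bz2 archive into dest_dir.

    Folders inside the archive are flattened, so the .ifo, .idx and
    .dict.dz files end up directly in dest_dir.
    Returns the sorted list of extracted file names.
    """
    os.makedirs(dest_dir, exist_ok=True)
    extracted = []
    with tarfile.open(archive_path, "r:bz2") as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue  # skip directories, links, etc.
            name = os.path.basename(member.name)
            source = tar.extractfile(member)
            with open(os.path.join(dest_dir, name), "wb") as out:
                out.write(source.read())
            extracted.append(name)
    return sorted(extracted)


# Example call (paths are assumptions, adjust to your own setup):
# extract_stardict(r"c:\temp\stardict-4jword3-2.4.2.tar.bz2", r"c:\temp")
```

If the returned list does not contain the .ifo, .idx and .dict.dz files, the download is probably incomplete.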

4) copy the "convstar.exe" and "star_style.txt" into c:\temp too. You will find these files in mdxbuilder, convstar folder.

5) open a command prompt (on Windows XP you will find it under Start / All Programs / Accessories)
the command prompt must show the path c:\temp> (because the files you want to convert are there)

then type : convstar 4jword3.ifo k2e.txt
You will obtain a new file in your temp folder named k2e.txt
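
If convstar refuses to open a dictionary, one thing worth checking is the .ifo file itself. It is a small text file: a fixed first line followed by key=value lines such as bookname and wordcount. The sketch below reads it into a Python dict. The helper name parse_ifo is hypothetical, and this is only a rough check, not what convstar does internally.

```python
IFO_MAGIC = "StarDict's dict ifo file"


def parse_ifo(path):
    """Read a StarDict .ifo file and return its key=value pairs as a dict.

    Raises ValueError if the first line is not the expected header.
    """
    # "utf-8-sig" also accepts files that start with a byte order mark.
    with open(path, encoding="utf-8-sig") as f:
        lines = [line.strip() for line in f if line.strip()]
    if not lines or lines[0] != IFO_MAGIC:
        raise ValueError("not a StarDict .ifo file: " + path)
    info = {}
    for line in lines[1:]:
        key, sep, value = line.partition("=")
        if sep:  # ignore lines without an "=" sign
            info[key.strip()] = value.strip()
    return info


# Example call (path is an assumption):
# print(parse_ifo(r"c:\temp\4jword3.ifo").get("bookname"))
```

The bookname value can also be reused later as the Title in MdxConvert.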

6) Run MdxConvert, fill in these parameters:
Source: C:\temp\k2e.txt
Target: C:\temp\JMDict4k2e.mdx
Style: C:\temp\star_style.txt
Original Format: MDict(Compact HTML)
Encoding: UTF-8(Unicode) <---Must use UTF-8 for all stardict dictionaries
Title: JMDict japanese 4 kanjis to english
Description: <font size=5 color=red> japanese 4 kanjis to english</font>
don't touch the rest :)
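
Since the Encoding setting has to match the source file, it can be useful to confirm that k2e.txt really is UTF-8 before clicking Start. Here is a small Python sketch for that. The function name is_valid_utf8 is my own, and the check only tells you whether the bytes decode as UTF-8, nothing about how MdxConvert reads them.

```python
import codecs


def is_valid_utf8(path, chunk_size=1 << 20):
    """Return True if the whole file decodes as UTF-8, False otherwise.

    The file is read in chunks, so large dictionaries do not need to
    fit in memory at once.
    """
    decoder = codecs.getincrementaldecoder("utf-8")()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            try:
                decoder.decode(chunk)
            except UnicodeDecodeError:
                return False
    try:
        # final=True reports a multi-byte character cut off at end of file.
        decoder.decode(b"", final=True)
    except UnicodeDecodeError:
        return False
    return True


# Example call (path is an assumption):
# print(is_valid_utf8(r"c:\temp\k2e.txt"))
```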

7) Click Start

8) Done.

Note: Some dictionaries contain International Phonetic Alphabet (IPA) symbols. To display them correctly, you need to install a TrueType font that supports IPA, for example the Lucida Sans Unicode font in Win98/2000/XP
(copy windows\fonts\l_10646.ttf into the PDA's \windows\ or \windows\fonts\ folder; you may need to soft reset your Pocket PC)


Voila monsieur, it is done.:)

If, still, there is a dictionary you would like but you don't feel like you can do it, I can try to upload converted files here (if it is OK with Xda-developers).


Cheers !
 
dam

Senior Member
Nov 21, 2006
65
3
Tokyo
nice !

Thank you, I will try to post more useful posts on this subject !

cheers :)
 
Curious D

Senior Member
Dec 15, 2006
151
0
dam said
If, still, there is a dictionary you would like but you don't feel like you can do it, I can try to upload converted files here (if it is OK with Xda-developers).


Cheers !

Hi. I hope you are still around to see this request. I like WordNet and I see that WordNet is up to 2.1 while the one that is available is 2.0. I was wondering if you knew how to create the 2.1 database as the files are not quite the same as Stardict. Thanks.

ftp://ftp.cogsci.princeton.edu/pub/wordnet
 

-=Ri/\xim=-

Member
Aug 21, 2007
9
0
Just want to say thanks to dam. Great tutorial, I have just converted the Britannica Concise Encyclopedia from stardict to mdict and it worked like a charm.

-=Ri/\xim=-

P.S. I think the MDXbuilder application has changed since you wrote this tutorial as there is now another input panel called "data" when building the database module. I left this empty since I didn't know what it meant and it made no difference at all.
 

hairyleprechaun

New member
Sep 27, 2007
2
0
dam said
2) extract the file into a temporary directory for example: c:\temp
There should be 3 files in c:\temp now:
4jword3.dict.dz
4jword3.idx
4jword3.ifo

Hello dam, thank you so much for the instructions, but I am having problems extracting the 4JWORDS file. In fact, I don't understand how to extract it. I have tried using WINZIP and a RAR extractor but all to no avail. All I have in my temp directory is the single file that I downloaded from sourceforge "stardict-4jword3-2.4.2.tar.bz2"

Can someone tell me how to extract the 4JWORDS file into the three files
4jword3.dict.dz
4jword3.idx
4jword3.ifo

Thank you so much for your help!

hairyleprechaun
 

hairyleprechaun

New member
Sep 27, 2007
2
0
Hello Everyone,

Well, I managed to get the file properly extracted with 7-Zip, and then I was able to create the k2e.txt file using the ConvStar application.

However, I have a problem now. When using the MdxBuilder application, the instructions say that I need to change the character encoding to UTF-8 for converting stardict databases. Well, in my drop-down box under Encoding, I only have UTF-16, ISO 8859-1, and GBK available. I don't have UTF-8 available in the Encoding drop-down box.

Will this cause a problem when converting the file?

I don't know if this will help or not, but on my computer I have Asian Languages turned on, and I often type in Chinese text.

Thanks for any help!
 

Curious D

Senior Member
Dec 15, 2006
151
0
I was looking at the Star Dict website and found Word Net 3.0 in the database. I tried to use Convstar for the wordnet.ifo file, but found that the program was not able to open up the file. Is this program only meant for the Asian languages or is the program just unable to open this particular file? Thanks.
 

corepda

Senior Member
Aug 4, 2007
1,059
34
California
Didn't work for me, mate. Can you please give me the link to the Wikipedia .mdx file? I would also like to know if there is any update to the database.