[NST/G] New Dictionaries


Aug 27, 2018
I am not sure what appinfo.db is, but if it is a database of the applications installed on the device, it could be of some help: digging through it, you might find out whether there is a dictionary on the device and possibly where it resides. Those ".journals" could also be handy if they are what I think they are. If they are app log files, analyzing them should lead you to the location of the dictionary .db file. Happy hunting!
I have been searching but there is no .db even in hidden folders lol


Recognized Contributor
Nov 21, 2013
It runs Android 4.4 from what I know; the files I found to root it said the device runs that version of Android.
So here is part of the confusion. The BNRV500 runs Android 2.3, Gingerbread. That's the Nook Glowlight (white case, no SD card). B&N is so unimaginative they can't come up with a simple system to name their own products so everything is mush.

What you have sounds like the Nook Glowlight Plus (the model following the BNRV500, I'm guessing). You can clear up some of this by looking at your Settings app. Somewhere in there is information "About" your device. It should give the model number in there.

But...all of this is moot if the dictionary is invisible.


Recognized Contributor
Nov 21, 2013
I'm not sure (because I never used it) but I think the Glows had you download a dictionary.
I can't really tell because I never put the WiFi on my Glow2 (BNRV510).
What do you get when you:
# am start -n bn.ereader/com.nook.app.dictionarylookup.LookupWordListActivity
For me it tries to connect to: https://docs.nook.com/cloud/login/AN-15146_v2/index.html#/
Yeah, now I am even more confused about the particular beast in question. The poster says BNRV500 and specifies Nook Glowlight Plus. Then he says the OS is KitKat. But everything I can find indicates that it should be Gingerbread for that model number and no "Plus".

The model you reference, BNRV510, appears to be the Glowlight Plus. The user manual makes no mention of the need to download a dictionary but describes the MW as "built in".

Either way, the lack of recognizable dictionary files on the device is puzzling.


Senior Member
Sep 19, 2020
Either it is an online-only dictionary, which is not nice of B&N, or it requires downloading a word database. In the second case, that could be an attempt by B&N to be nice and periodically update the word database. Still, if you cannot find a database on the device, I can only speculate that it is either deliberately buried inside the Linux-ish portion of the OS or masked somehow among the temporary files on the device, the logic being that if it gets deleted it is simply downloaded again. The man clearly stated that his device is rooted, so he should be able to find those files, unless a different kind of database file type is used on this device.

    Disclaimer 1: I am neither a lexicographer nor a linguist. The dictionaries I have put together or organized are as much demonstrations as they might be practical tools (maybe more demonstrations). I've tried not to introduce any errors, but I am not responsible for erroneous material that was already present.

    Disclaimer 2: No attempt has been made to alter or "improve on" the internal structure of the dictionaries, which are modeled after the stock dictionary and comparable to it in file size. These dictionaries may not be suitable for all users.

    Note: An earlier forum thread with the most information on alternative dictionaries is here.

    I first became interested in the structure and potential production of dictionaries for the NST/G when I was working on updating the UK version of the ROM for FW 1.2.2. I saw that dictionary management was built into the Settings app and that got me thinking. Of course it doesn't work on the UK ROM and I now think it's doubtful that there ever were any non-English dictionaries available for download. Still, the seed was planted.

    In conjunction with the release of a Dictionary Management app for the NST/G, I am making available a set of single-language dictionaries and three sets of translation dictionaries for the languages originally supported on the UK ROM. These dictionaries are NOT, however, for the UK ROM but rather the more common US version of the OS.

    The first set of dictionaries I built from scratch using the "translation" table of Wiktionary databases and an adapted Python routine I discovered while researching. These contain more words than the second or third sets of dictionaries, but they include incomplete entries as well as complete ones. The simplest entries are just word→word with no other information. The more complete entries include a short contextual sense in the first language (practically a definition), the part of speech, gender (if applicable and available), as well as a list of possible words in the second language.

    Just as I had wrapped up my "final" pair of dictionaries my wandering searches delivered me to Wikdict. There the clever people had done some amazing cross referencing of the three databases that are involved with each language pair. From this they generated translation dictionaries in Stardict format. After decompiling one I was so impressed I started again on another set of dictionaries. This second set covers fewer words since only the complete entries from Wiktionary are used.

    And then there is the Wiktionary site of Matthias Buchmeier. Among others, he has assembled translation dictionaries in the ding format (text--yeah, I never heard of it either). I liked these also, and although they were more difficult to work with, I gave them a whirl to produce a third set of translation dictionaries. Now I'm in rehab.

    Edit: and from my little padded cell I managed to sneak a peek at the WWW and finally found a source for single-language dictionaries based on Wiktionary, thanks to Mickaël Schoentgen et al. Although the Italian and Spanish dictionaries are on the small side, these are in a format similar to the Oxford English dictionary in which all senses and part-of-speech variations of a word are in a single citation, so you can't really compare "word" counts with the translation dictionaries.

    Each dictionary consists of two files: basewords.db and inflectedwords.db. If you're not interested in the Dictionary Management app you can still use these dictionaries. However, contrary to some of the posts in the long-ago thread on alternative dictionaries, it is not wise to simply replace the stock files with these dictionaries (even after backing up the stock files). During the development of the Dictionary Management app I noticed that over time the available space in /system decreased during dictionary swaps. It took me a lot of fooling around and research to sort out the remedy, so if you elect to skip the app, be sure to follow the manual dictionary installation instructions carefully. Each of the basewords.db files that go from English→whatever use the same inflectedwords.db (not the same as the stock file). If the dictionary goes from whatever→English, there is an inflectedwords.db specifically for that language (the same inflectedword files are used for any of the three sets of dictionaries).

    For each dictionary you must download one basewords file and one inflectedwords file. There is only one inflectedwords file for each language, however, so you don't need to download duplicates. You want an inflectedwords file for the first language in a pair.

    Single Language (Wiktionary, after Mickaël Schoentgen et al.)


    French (437,983 entries!)
    German (135,658 entries)
    Italian (52,626 entries)
    Spanish (55,631 entries)
    (you also need an inflectedwords.db file from below matching the language)



    Set 1 (direct from Wiktionary translation table)

    English→French (114,000 entries, 70,244 complete)
    English→German (81,200 entries, 68,874 complete)
    English→Italian (74,190 entries, 65,281 complete)
    English→Spanish (67,647 entries, 65,623 complete)
    French→English (103,760 entries, 90,230 complete)
    German→English (113,542 entries, 73,286 complete)
    Italian→English (54,743 entries, 29,204 complete)
    Spanish→English (51,124 entries, 13,101 complete)

    Set 2 (Wiktionary via Wikdict)

    English→French (44,208 entries)
    English→German (41,233 entries)
    English→Italian (35,971 entries)
    English→Spanish (39,855 entries)
    French→English (73,751 entries)
    German→English (52,115 entries)
    Italian→English (21,112 entries)
    Spanish→English (8,726 entries)

    Set 3 (Wiktionary after Matthias Buchmeier)

    English→French (81,464 entries)
    English→German (79,068 entries)
    English→Italian (59,474 entries)
    English→Spanish (72,176 entries)
    French→English (93,969 entries)
    German→English (83,569 entries)
    Italian→English (150,625 entries)
    Spanish→English (108,852 entries)


    Inflectedwords files (one per language)

    English (105,660 forms)
    French (284,435 forms)
    German (631,222 forms)
    Italian (313,537 forms)
    Spanish (488,956 forms)

    For comparison with the values shown above, the stock basewords.db contains 86,301 entries. That is supplemented by the biographical and geographical dictionary which contains 15,745 entries. The dictionaries based on Wiktionary databases contain bio/geo entries (too many, if you ask me...) along with the other words, so for typical words you might look up the number of entries is somewhat inflated.

    Other dictionaries

    There are many, many, many language pairs among the databases and files from Wiktionary and Wikdict. I didn't look very closely and some might be quite small like Spanish (or even smaller!), but if you have an interest in trying your hand at a different combination, there is more information in the technical section on making dictionaries. There are single-language databases but they don't include definitions or even senses. They seem to mostly focus on statistics and also sometimes include genders and parts of speech. It is possible to construct a very simple dictionary using, say, the German-English pair, ignoring the English translations in the construction of the database. This would only work for the complete citations, of course, so you'd have smallish dictionaries. However, I think the data represented in the single-language dictionaries by M. Schoentgen are much more extensive.

    Problems, Probleme, problèmes, i problemi, problemas

    Single-word lookup for translation is rife with challenges, especially where multi-part verbs are concerned. It's also true that some words don't have single-word equivalents in some languages. Anyone who has ever struggled with a translation dictionary while enrolled in a first-year language course knows this already. Therefore the more contextual information available in the lookup citation, the better. That's one reason why I like the Wiktionary databases--when the citation is complete. Like Wikiwhatever, Wiktionary is, however, a work in progress. Data dumps are annual and one might expect some progress each year. Exactly why Spanish has received so little love is a mystery to me, considering the large number of Spanish speakers around the world. The small size of the Italian databases is a little easier to understand.

    For the most part, the Lookup function behaves as you might expect but French presents some problems. Lookup is equipped to deal with hyphenations that span lines (there are hyphen dictionaries for various languages buried in the NST). But Lookup bracketing eschews any other punctuation as far as I can tell. The rather extravagant use of the apostrophe (or what looks like an apostrophe--sorry, my French is limited to culinary and operatic) in written French means that only part of a word will sometimes be selected. It is possible to drag the selection brackets across an apostrophe and on to the terminus of a word, but it is difficult.

    Finally, the function of the inflectedwords.db is spotty. Based on my tests even the stock dictionary sometimes fails to pick up an inflection and properly direct it to the baseword ("word not found"), even though I can see with my own eyes the correct inflection and referral in the database. In your native language it is easy enough to back up over an ending you recognize to find the baseword. But in a translation dictionary you may not be as familiar with inflections. Your mileage may vary.

    Irony \ˈīrənē\

    While working with the UK ROM got me started down this path, I have not been successful in creating any dictionaries for it. There is more discussion of this in the technical section on stock dictionaries. Supposedly it has been done in the past (see: https://forum.xda-developers.com/t/...nario-en-espanol-para-nook-glowlight.3554472/). While the post is not specifically about the NST/G, the dictionary format is apparently the same. I sent the member a message but as he has not been seen since 2019, the likelihood of a reply is pretty small.

    Manual dictionary installation

    My key discovery in the development of the Dictionary Management app is that com.bn.nook.reader.activities must be killed prior to moving any dictionary files. This process generally runs in the background and it has tendrils attached to the dictionary databases. If you don't kill the process, you may not get back all of the free space due when removing the stock dictionaries, or removing custom dictionaries and replacing with the stock dictionaries. At some point, you will not even have enough free space left to restore the stock dictionaries, even though there should be plenty. If reading this makes your head hurt already, you'd be happier installing the app. Otherwise, read on.

    You cannot safely swap dictionaries without ADB (or the app)

    Copy the basewords and inflectedwords files from the dictionary you want to the root of your sdcard.

    Connect to your device with ADB (via USB or WiFi). Execute the following:
    adb shell pidof com.bn.nook.reader.activities
    [a four digit number is returned*]
    adb shell kill [the four digit number--no square brackets]
    *If you get a blank response rather than a 4-digit number, it is likely that com.bn.nook.reader.activities has died a natural death. If so, you can skip the "kill" command.

    Now you need to back up the stock dictionary. If you use an sdcard that is easiest:
    adb shell mkdir /sdcard/Dictionary
    adb shell mount -o rw,remount /dev/block/mmcblk0p5 /system
    adb shell cp /system/media/reference/basewords.db /sdcard/Dictionary/basewords.db
    adb shell cp /system/media/reference/inflectedwords.db /sdcard/Dictionary/inflectedwords.db
    If you are uneasy about this, check with a file manager that the operations above did what they should have (don't leave the ADB connection or you might have to kill com.bn.nook.reader.activities again!)

    Now to delete the stock dictionary (not the backup):
    adb shell rm /system/media/reference/basewords.db
    adb shell rm /system/media/reference/inflectedwords.db
    Now to copy the new dictionary to its proper place:
    adb shell cp /sdcard/basewords.db /system/media/reference/basewords.db
    adb shell cp /sdcard/inflectedwords.db /system/media/reference/inflectedwords.db
    adb shell chmod 644 /system/media/reference/basewords.db
    adb shell chmod 644 /system/media/reference/inflectedwords.db
    That should do it. When you access a book, com.bn.nook.reader.activities will restart and you'll be using a new dictionary.
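For anyone who would rather script the installation steps above than type them one at a time, here is a minimal sketch in Python. It only assembles the same adb commands listed in this post, in the same order. The function names (find_pid, install_commands, run) are made up for illustration, and it assumes adb is on your PATH. By default it only prints the commands (dry run) so you can check them first.

```python
import subprocess

REF = "/system/media/reference"
FILES = ("basewords.db", "inflectedwords.db")
PROCESS = "com.bn.nook.reader.activities"


def find_pid():
    """Ask the device for the reader process id; returns '' if it is not running."""
    out = subprocess.run(["adb", "shell", "pidof", PROCESS],
                         capture_output=True, text=True)
    return out.stdout.strip()


def install_commands(pid, src="/sdcard", backup="/sdcard/Dictionary"):
    """Build the command sequence from the post: kill, back up, delete, copy, chmod."""
    cmds = []
    if pid:  # skip the kill if the process has already died a natural death
        cmds.append(["adb", "shell", "kill", pid])
    cmds.append(["adb", "shell", "mkdir", backup])
    cmds.append(["adb", "shell", "mount", "-o", "rw,remount",
                 "/dev/block/mmcblk0p5", "/system"])
    for name in FILES:  # back up the stock files
        cmds.append(["adb", "shell", "cp", f"{REF}/{name}", f"{backup}/{name}"])
    for name in FILES:  # delete the stock files (not the backup)
        cmds.append(["adb", "shell", "rm", f"{REF}/{name}"])
    for name in FILES:  # copy the new dictionary into place
        cmds.append(["adb", "shell", "cp", f"{src}/{name}", f"{REF}/{name}"])
    for name in FILES:
        cmds.append(["adb", "shell", "chmod", "644", f"{REF}/{name}"])
    return cmds


def run(cmds, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in cmds:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # "1234" stands in for whatever pidof returns on your device.
    run(install_commands("1234"), dry_run=True)
```

Note that check=True only catches a failure of adb itself; depending on the adb version, a failed command on the device may not be reported back, so it is still worth checking the backup with a file manager before trusting the result.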

    To restore the stock dictionary:
    adb shell pidof com.bn.nook.reader.activities
    [a four digit number is returned]
    adb shell kill [the four digit number--no square brackets]
    adb shell mount -o rw,remount /dev/block/mmcblk0p5 /system
    adb shell rm /system/media/reference/basewords.db
    adb shell rm /system/media/reference/inflectedwords.db
    adb shell cp /sdcard/Dictionary/basewords.db /system/media/reference/basewords.db
    adb shell cp /sdcard/Dictionary/inflectedwords.db /system/media/reference/inflectedwords.db
    adb shell chmod 644 /system/media/reference/basewords.db
    adb shell chmod 644 /system/media/reference/inflectedwords.db
    You can remove the backup files in /sdcard/Dictionary with a file manager.

    Technical section: stock dictionaries

    The stock dictionary for the US version of the NST/G consists of four databases: basewords, inflectedwords, bgwords, and fwp. The files are found in /system/media/reference. The first two files are of primary interest. "basewords.db" contains the words that will first be checked once the Lookup function in the stock reader has selected a word.

    If no match is found, then the word is sought in "inflectedwords.db". This table consists of variations on the basewords based on case, number, tense, etc. (in other languages, gender also). If a variation is found it will point to an uninflected baseword. Some inflected forms are targeted to specific basewords to reduce spurious usages. For example, the inflected form "butterflied" points specifically to the verb baseword which means to spread out flat, as in "to butterfly a leg of lamb". The inflected form "butterflies", on the other hand, points to the noun baseword which refers to the insect as well as those funny feelings in your stomach. Many inflected forms are not so targeted.
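The lookup order described above can be sketched in a few lines of Python. This is only an illustration of the logic, not the reader's actual code: the table and column names (base, inflected, term, entry, form, target) are invented for the example, the definitions are stored as plain text rather than zipped BLOBs, and which butterfly sense gets [1] or [2] is my assumption.

```python
import sqlite3


def open_demo():
    """Create two tiny in-memory stand-ins for basewords.db and inflectedwords.db."""
    base = sqlite3.connect(":memory:")
    base.execute("CREATE TABLE base (term TEXT, entry TEXT)")
    base.executemany("INSERT INTO base VALUES (?, ?)", [
        ("butterfly[1]", "noun: an insect"),
        ("butterfly[2]", "verb: to spread out flat"),
        ("lamb", "noun: a young sheep"),
    ])
    infl = sqlite3.connect(":memory:")
    infl.execute("CREATE TABLE inflected (form TEXT, target TEXT)")
    infl.executemany("INSERT INTO inflected VALUES (?, ?)", [
        ("butterflied", "butterfly[2]"),  # targeted at one sense
        ("butterflies", "butterfly[1]"),  # targeted at one sense
        ("lambs", "lamb"),                # not targeted
    ])
    return base, infl


def senses(base, term):
    """Return every baseword row for term: an exact match or any term[n] sense."""
    sql = ("SELECT term, entry FROM base "
           "WHERE term = :t OR substr(term, 1, length(:t) + 1) = (:t || '[') "
           "ORDER BY term")
    return base.execute(sql, {"t": term}).fetchall()


def lookup(base, infl, word):
    """Check the basewords first, then fall back to the inflected forms."""
    rows = senses(base, word)
    if rows:
        return rows
    hit = infl.execute("SELECT target FROM inflected WHERE form = ?",
                       (word,)).fetchone()
    if hit is None:
        return []  # "word not found"
    return senses(base, hit[0])


if __name__ == "__main__":
    base, infl = open_demo()
    for w in ("butterfly", "butterflied", "lambs", "xyzzy"):
        print(w, "->", lookup(base, infl, w))
```

Looking up "butterfly" returns both senses, while "butterflied" returns only the sense its pointer is targeted at, which is the behavior described above.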

    Lookup also checks bgwords.db, which contains biographical and geographical names. Finally, the fwp.db contains "common" foreign words and phrases which one might encounter in English. As the Lookup function can only select one word at a time (without a lot of fuss), this database is mostly useless. These last two databases can remain in place when alternate basewords and inflectedwords databases are substituted for stock. They will continue to function normally.

    The structures of the four databases are similar only in that the first field is populated by the potential Lookup word(s). In inflectedwords.db, the second field is populated by the baseword pointers, with [n] (where "n" is an integer) following pointers that are targeted to specific baseword senses.


    In the other three databases, the first field is likewise populated by potential Lookup words. The second field, however, consists of a binary file (or BLOB, in SQLite terminology) which contains the explanatory information (part of speech, pronunciation, definition, etc.). That information is formatted as HTML and then (pk)zipped. The same Lookup word with a different sense (take the "butterfly" example from above) will have a numerical indication after it, e.g., butterfly[1] and butterfly[2]. In the HTML structure this number will appear as a superscript before the word. So if one looks up "butterfly", both senses will be displayed, but in separate groupings. However, if an inflected form is the Lookup word, it is possible that only one sense will be shown since the inflected form might be targeted at just one sense for which the inflection would only make sense (as in the butterfly example).
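To make the "HTML, then (pk)zipped, then stored as a BLOB" idea concrete, here is a small Python sketch using only the standard library. The table layout (words, term, description), the name of the member inside the zip archive, and the sample HTML are all assumptions for illustration; I am not claiming this is byte-for-byte what the stock files contain.

```python
import io
import sqlite3
import zipfile


def pack_html(term, html):
    """Zip an HTML citation in memory and return the archive bytes (the BLOB)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(term, html.encode("utf-8"))  # member name is an assumption
    return buf.getvalue()


def unpack_html(blob):
    """Read the first member of a zipped BLOB back into an HTML string."""
    with zipfile.ZipFile(io.BytesIO(blob)) as zf:
        return zf.read(zf.namelist()[0]).decode("utf-8")


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE words (term TEXT, description BLOB)")
    html = "<p><sup>2</sup><b>butterfly</b> <i>verb</i> to spread out flat</p>"
    db.execute("INSERT INTO words VALUES (?, ?)",
               ("butterfly[2]", pack_html("butterfly[2]", html)))
    (blob,) = db.execute("SELECT description FROM words WHERE term = ?",
                         ("butterfly[2]",)).fetchone()
    print(unpack_html(blob))
```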


    In the UK version of the NST/G, there is a single database file which serves the same purposes as the four files in the US version. The database is also located in /system/media/reference/. The filename appears to be somehow "locked", and the OS will not recognize any file that is not named "ox_en_GB.db". Also, there is a table in the database called "nook_metadata" that contains information which the OS apparently looks for when deciding whether to accept the database or not. These are mostly suppositions, but are reasonable in light of the behavior of the system when a different database is provided. The Nook App for Android uses a dictionary of the same structure, although it is a version of the Merriam-Webster Collegiate, 11th ed. (mw_11_en_US.db). That dictionary is not recognized by the UK version of the NST/G unless it is renamed and the nook_metadata table is replaced by the one from the stock dictionary.

    At first glance the structure of the ox_en_GB.db seems familiar, as if the basewords and inflectedwords tables as used in the US version have simply been combined in one database. Indeed, the structures of the two tables in the database do seem to be like the structures of the two individual databases in the US dictionary. But they are not.


    There are subtle differences. For example, there are no targeted inflected forms. This can be seen from the fact that there are no [n] values after any baseword pointers. So it should not be surprising that there are no [n] values after any of the basewords. Instead, all possible senses of the baseword are contained in the BLOB. If you look up "butterflied", you're going to read about insects, fluttery stomachs, AND legs of lamb.

    Biographical and geographical names are included in the basewords. There are a few foreign words and phrases listed also, but because of the one-word Lookup, you're unlikely to encounter them.

    Finally, the BLOB is (g)zipped, not (pk)zipped. Substitution of the wrong zip format results in an unreadable file.
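If you are not sure which compression a given BLOB uses, the first few bytes give it away: gzip data starts with the bytes 1f 8b, and a zip archive starts with "PK" followed by 03 04. A minimal Python sketch of that check (the function names are mine, and reading only the first member of a zip is an assumption about how the BLOBs are laid out):

```python
import gzip
import io
import zipfile


def blob_format(blob):
    """Guess the compression of a BLOB from its leading magic bytes."""
    if blob[:2] == b"\x1f\x8b":
        return "gzip"
    if blob[:4] == b"PK\x03\x04":
        return "pkzip"
    return "unknown"


def blob_to_html(blob):
    """Decompress a BLOB with whichever of the two formats it turns out to be."""
    kind = blob_format(blob)
    if kind == "gzip":
        return gzip.decompress(blob).decode("utf-8")
    if kind == "pkzip":
        with zipfile.ZipFile(io.BytesIO(blob)) as zf:
            return zf.read(zf.namelist()[0]).decode("utf-8")
    raise ValueError("BLOB is neither gzip nor pkzip")


if __name__ == "__main__":
    html = "<p><b>butterfly</b> insects, fluttery stomachs AND legs of lamb</p>"
    uk_style = gzip.compress(html.encode("utf-8"))
    print(blob_format(uk_style), blob_to_html(uk_style) == html)
```

Running blob_format over a handful of rows from a database is a quick way to confirm which format you need to produce before building a replacement.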

    The UK version of the ROM also has the theoretical ability to download additional single-language dictionaries and exchange them for the stock dictionary without any technical magic on the part of the user. There is a Settings page for managing the dictionaries (which is how I got started on all of this) and the Lookup window also shows an option to change dictionaries. From a scan of the Reader smali files it is clear that downloaded dictionaries were supposed to end up in /data/media/B&N Downloads/Dictionary and that Oxford versions of German, French, Spanish and Italian were planned. Also, the Merriam-Webster English variant is mentioned. The two English dictionaries have code numbers, very long strings reminiscent of the strings used to identify B&N downloaded books. The same number string is still used to identify the M-W dictionary in the Nook App for Android.

    But that's about as far as I got. I can't say so definitively, but I suspect there never were any other dictionaries available (so the "All dictionaries are free" declaration on the Settings page is a classic case of an ironic "you get what you pay for"). Had there been, there surely would have been talk of them on the forum or in similar other venues online. The Nook App for Android still touts the ability to download non-English dictionaries (for free...) but when I tried it out there was never anything offered but the default M-W. I called B&N and eventually spoke to a human being who quoted chapter-and-verse from the user guide when I asked about this. I replied that I did exactly what it said but saw no other dictionaries. Then the person quickly pivoted and said "oh, you can't do that". Of course, I knew that already. Perhaps you have to be outside the US? I didn't ask and because I had significant doubts about the existence of said (free) dictionaries, I didn't pursue the bother of a VPN to test the idea.

    The UK ROM does not recognize other dictionaries placed in /data/media/B&N Downloads/Dictionary, not even the mw_11_en_US.db. I'm guessing that this is because downloaded dictionaries would be entered into the various reader databases just like downloaded books. Without the proper entries (including that long number, no doubt), there is no recognition. So the dictionary management capability of the ROM is moot until someone a lot more clever than I am can suss out the information from the smali files.

    Technical section: making dictionaries

    Finding free-use databases that make this endeavor worthwhile is not easy. Having a way to manipulate and eventually make this data into the basewords.db and inflectedwords.db which the NST/G uses is also a significant challenge. I did a lot of searching before happening on a Python routine for making dictionaries from tab-separated text files: https://github.com/geoRG77/nook-dictionary. As I know nothing about Python I was a bit skeptical of the author's claim that the script could be readily customized to suit! It turns out that with a good understanding of if/else conditionals and loop structures (and a more-than-healthy dose of chutzpah), it was possible for me to make the changes I needed, much trial and error later. I eventually also added a one-item table, nook_metadata, which holds source/date or edition information. This is for display purposes in my Dictionary Management app and has no effect on the function of the database.

    The next happy discovery was that there were annual SQLite database dumps of Wiktionary language pair data: https://download.wikdict.com/dictionaries/sqlite/. The "translation" table in the database was in many cases all the data that was needed to construct simple dictionaries. For languages in which nouns are gendered, the information was sometimes found in the translation table (Spanish) or sometimes only in the single-language database. In some cases gender information was missing altogether and I had to scrounge around for something online and integrate the genders of any matching nouns into the database. I did most of my work with a combination of Notepad++ and SQLite Database Browser, moving back and forth between the two formats as needed (SQLite Database Browser can import and export TSV text and includes a place to enter SQLite commands directly). Only a few SQLite manipulations seemed to require SQLite from the command prompt. These either involved pragma changes or comparisons of data in two different tables which appeared to overtax the SQLite Database Browser and result in spinning circles and "Not Responding" messages (although sometimes a walk-away for a snack seemed to make all that eventually work out). Even Notepad++ sometimes found the large text files a bit unwieldy (BTW, Notepad++ can be configured to show tabs and spaces--very handy).

    When I had finished with my "last" dictionary I discovered that a data dump for 2022 had arrived but I didn't go back to compare. Instead I continued to scrounge around and found two other Wiki sources worthy of notice. The first is the Wiktionary site of Matthias Buchmeier. There are various formats of dictionaries built from Wiktionary data found there. The text dictionaries in ding format are fairly straightforward and in the general format:

    baseword {part-of-speech/gender} \pronunciation\ [some contextual info about source, auxiliaries for verbs, etc.] :: translation

    ("nouns" are not explicitly identified; instead a gender is given--a problem in languages that use neuter gender as both the gender and "noun" are given the same symbol). There's just barely enough markup to allow creative search/replace to generate a TSV structure (or even HTML, I guess), although there is not always a part-of-speech/gender entry or the [...] entry so that would need to be dealt with. Also, these dictionaries are sometimes significantly larger than others. Spanish→English, one of the smallest I have made from the translation table, contains 108,000 entries! So while these are quite simple, they may be worth a look. In fact, they are so straightforward you could probably do everything you needed to prepare them for the Python script with Notepad++ alone if you are very clever and careful. I was neither (or not enough of either) and these turned out to be very frustrating with lots of botched lines. But it can be done.

    Perhaps most interesting on that page is the link to Wikdict. The team there has very creatively cross-referenced the three databases for each language pair (well, four, I guess, if you count English) and created StarDict format dictionaries that are pretty impressive.

    In the end I decided to create yet another set of dictionaries based on the Wikdict output along with the simpler dictionaries I prepared myself from the database translation tables and the dictionaries after Matthias Buchmeier. In addition to the two other tools mentioned above I used stardict-editor for Windows. That was needed to decompile the dictionary from Wikdict. This results in a TSV file of the form baseword→HTML. The HTML is almost usable as is if the Python routine is altered yet again to accept it as the already-complete HTML string. That's what I did but not before changing a few things. The most egregious problem is that all proper nouns are listed as "pronouns". This required a case-sensitive SQLite approach to fix (looking for capitalized words listed as "pronouns"). I also removed the pronunciation guides with their arcane symbols few can interpret (and for which the NST/G might lack font support). I objected to grammatical genders being given as "male, female, neutral". Never in my various studies of languages have I ever encountered those terms. I opted instead for m, f, n. I also removed all instances of \n (probably a newline character). Finally I tightened up the HTML, preferring <p> to <div> which otherwise adds a lot of white space to a citation without some CSS to calm it down. If you were going to do this yourself and were not so persnickety, you could just replace all \n with <br/> and you'd have a working HTML citation string.
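Two of the cleanup steps above lend themselves to a short sketch. The first uses SQLite's GLOB operator, which is case-sensitive, to find capitalized words tagged as pronouns; the second swaps the literal two-character \n for either nothing or <br/>. The table and column names (entries, word, pos) and the "proper noun" label are hypothetical, and this is not necessarily the exact approach I used. Note that [A-Z] only catches unaccented capitals, and that genuinely capitalized pronouns (German "Sie", for one) would be relabeled too, so review the matches before committing.

```python
import sqlite3


def fix_proper_nouns(db):
    """Relabel capitalized 'pronouns' as proper nouns; returns rows changed."""
    cur = db.execute(
        "UPDATE entries SET pos = 'proper noun' "
        "WHERE pos = 'pronoun' AND word GLOB '[A-Z]*'"  # GLOB is case-sensitive
    )
    return cur.rowcount


def clean_citation(html, replacement=""):
    """Replace the literal backslash-n sequence (two characters) in a citation."""
    return html.replace("\\n", replacement)


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE entries (word TEXT, pos TEXT)")
    db.executemany("INSERT INTO entries VALUES (?, ?)", [
        ("Berlin", "pronoun"),   # mislabeled proper noun
        ("he", "pronoun"),       # real pronoun, lowercase, left alone
        ("Haus", "noun"),
    ])
    print(fix_proper_nouns(db), "row(s) relabeled")
    print(clean_citation("<p>house\\nhome</p>", "<br/>"))
```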

    Edit: and then...So I finally stumbled on what I had been looking for all along: single language dictionaries! These, too, are based on Wiktionary, but not on the database dumps, rather the massive complete data dumps of the site. The github site of Mickaël Schoentgen contains a number of dictionaries in Kobo, Stardict and other formats. The data is actually updated nightly! Of course, it's not a simple step from the Stardict format to basewords.db, but it's not that bad (except for the French dictionary which is HUGE). My only bone to pick is the lack of part-of-speech data.

    A general outline of how each of the sources was prepared for the Python routine would involve a lot of steps and I'm not proposing to list them here. If you are interested in trying the process yourself, let me know and I will provide more detailed information. Suffice it to say that the goal is to prepare a tab-separated text file consisting of whatever information you want to retain and free of any words or characters that either Windows or Python finds objectionable (there are quite a few...).

    Inflected forms are essential for a dictionary that is not to be endlessly frustrating, unless you are working with a language that is not inflected. I got lucky with this by closely reading an old XDA thread. There are some inflected word lists here: https://github.com/Tvangeste/dsl2mobi/tree/master/wordforms. These are in the format baseword:form1,form2,form3... It was a simple matter to convert all the punctuation to tabs and then rejigger the Python routine to create an inflectedwords.db for a given language. In the case of English, I fell back on the stock database, but I needed to strip out the targets to specific senses since there would be no way of knowing if these were even valid in the various dictionaries. That part was easy but left me with multiple entries of the same thing. I eventually found a way using SQLite to eliminate all but one of each duplicate set. This is the inflectedwords.db that should be used with all of these dictionaries where English is the first language.
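The three steps in the paragraph above (splitting baseword:form1,form2 lines, stripping the [n] sense targets, and removing the duplicates that stripping creates) can be sketched in Python with the standard sqlite3 module. The table and column names (inflected, form, target) are placeholders for the example, and the DELETE shown is just one common SQLite way to keep a single row per duplicate set, not necessarily the one I ended up with.

```python
import re
import sqlite3


def parse_wordforms(lines):
    """Yield (inflected form, baseword) pairs from 'baseword:form1,form2' lines."""
    for line in lines:
        line = line.strip()
        if not line or ":" not in line:
            continue
        base, forms = line.split(":", 1)
        for form in forms.split(","):
            form = form.strip()
            if form:
                yield form, base.strip()


def strip_target(pointer):
    """Drop a trailing [n] so the pointer no longer targets a specific sense."""
    return re.sub(r"\[\d+\]$", "", pointer)


def build_inflected(rows):
    """Load the rows, untarget the pointers, and keep one row per duplicate set."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE inflected (form TEXT, target TEXT)")
    db.executemany("INSERT INTO inflected VALUES (?, ?)",
                   ((form, strip_target(target)) for form, target in rows))
    db.execute("DELETE FROM inflected WHERE rowid NOT IN "
               "(SELECT MIN(rowid) FROM inflected GROUP BY form, target)")
    return db


if __name__ == "__main__":
    rows = list(parse_wordforms(["gehen:gehe,gehst,geht"]))
    # Stock-style targeted pointers collapse to one row once [n] is stripped.
    rows += [("butterflies", "butterfly[1]"), ("butterflies", "butterfly[2]")]
    db = build_inflected(rows)
    for row in db.execute("SELECT form, target FROM inflected ORDER BY form"):
        print(row)
```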

    If you want to create a dictionary you need an inflected word list unless your baseword source includes inflected forms (very unlikely) or the language is not inflected. If your choice is not among those in the link I gave in the previous paragraph, the hunt is on. First, check the single-language databases at Wiktionary. These are mostly statistical but also contain the word list used to create the translation dictionaries and sometimes gender and part-of-speech data. Also, the "form" table in the database sometimes holds a list of inflected forms. It may need some massaging, but if it's there you will have a way to create a dictionary. If your language is not inflected you will still need an inflectedwords database with at least a single entry in the table or the NST/G will not recognize the dictionary at all.