Linguist
by finngruwier
I wrote this extension to provide an easy way for users to create a list of new candidate words for the existing spellcheck dictionary for their language. Just run the command "List Non-recognized Words" and you will get a list of all the words in your document that are not recognized during spellchecking. After quality-assuring this list (removing misspellings etc.), simply send the document to the people who maintain the dictionary. You can also use it to build a personal wordbook. There is also a command that makes a complete list of all the words in the document, sorted alphabetically (no duplicates). For several reasons I chose to write this extension in Python, which was probably not the easiest choice, since Python support in OOo is not very well documented yet. You are welcome to study my code as an example of a simple Python extension.
License: opensource
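The two commands described above can be sketched in plain Python. This is only an illustration of the idea, not the extension's actual code: the names `all_words` and `unrecognized_words` and the word-matching pattern are assumptions, and the `is_valid` callback stands in for whatever spellchecker the office suite provides.

```python
import re

# Assumed definition of a "word": a run of letters, optionally joined by
# apostrophes or hyphens. The real extension may tokenize differently.
WORD_RE = re.compile(r"[^\W\d_]+(?:['’-][^\W\d_]+)*")


def all_words(text):
    """Return every distinct word in text, sorted alphabetically."""
    # The set removes duplicates; the key sorts case-insensitively first.
    return sorted(set(WORD_RE.findall(text)), key=lambda w: (w.lower(), w))


def unrecognized_words(text, is_valid):
    """Return the distinct words for which is_valid(word) is false.

    is_valid is a hypothetical stand-in for the spellchecker lookup.
    """
    return [w for w in all_words(text) if not is_valid(w)]
```

For example, with a toy dictionary `known = {"the", "cat"}`, calling `unrecognized_words("The cat sat", lambda w: w.lower() in known)` returns `["sat"]`.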
Comments
WrongWordsList macro
This extension works in a similar fashion to this macro:
http://user.services.openoffice.org/en/forum/viewtopic.php?f=20&t=1222&s...
Did you use that macro for inspiration?
Sub WrongWordsList
    Dim oDocModel as Variant
    Dim oTextCursor as Variant
    Dim oLinguSvcMgr as Variant
    Dim oSpellChk as Variant
    Dim oListDocFrame as Variant
    Dim oListDocModel as Variant
    Dim sListaPalabras as String
    Dim aProp() As New com.sun.star.beans.PropertyValue
    oDocModel = StarDesktop.CurrentFrame.Controller.getModel()
    If IsNull(oDocModel) Then
        MsgBox("There's no active document." + Chr(13))
        Exit Sub
    End If
    If Not HasUnoInterfaces (oDocModel, "com.sun.star.text.XTextDocument") Then
        MsgBox("This document doesn't support the 'XTextDocument' interface." + Chr(13))
        Exit Sub
    End If
    oTextCursor = oDocModel.Text.createTextCursor()
    oTextCursor.gotoStart(False)
    oLinguSvcMgr = createUnoService("com.sun.star.linguistic2.LinguServiceManager")
    If Not IsNull(oLinguSvcMgr) Then
        oSpellChk = oLinguSvcMgr.getSpellChecker()
    End If
    If IsNull (oSpellChk) Then
        MsgBox("It's not possible to access to the spellcheck." + Chr(13))
        Exit Sub
    End If
    Do
        If oTextCursor.isStartOfWord() Then
            oTextCursor.gotoEndOfWord(True)
            ' Check whether the word is spelled correctly
            If Not isEmpty (oTextCursor.getPropertyValue("CharLocale")) Then
                If Not oSpellChk.isValid(oTextCursor.getString(), oTextCursor.getPropertyValue("CharLocale"), aProp()) Then
                    sListaPalabras = sListaPalabras + oTextCursor.getString() + Chr(13)
                End If
            End If
            oTextCursor.collapseToEnd()
        End If
    Loop While oTextCursor.gotoNextWord(False)
    If Len(sListaPalabras) = 0 Then
        MsgBox("There are no errors in the document.")
        Exit Sub
    End If
    oListDocFrame = StarDesktop.findFrame("fListarPalabrasIncorrectas", com.sun.star.frame.FrameSearchFlag.ALL)
    If IsNull(oListDocFrame) Then
        oListDocModel = StarDesktop.loadComponentFromURL("private:factory/swriter", "fListarPalabrasIncorrectas", com.sun.star.frame.FrameSearchFlag.CREATE, aProp())
        oListDocFrame = oListDocModel.CurrentController.getFrame()
    Else
        oListDocModel = oListDocFrame.Controller.getModel()
    End If
    oTextCursor = oListDocModel.Text.createTextCursor()
    oTextCursor.gotoEnd(False)
    oListDocModel.Text.insertString (oTextCursor, sListaPalabras, False)
    oListDocFrame.activate()
End Sub
WrongWordsList
No, I haven't seen that macro before. But as far as I can tell from the code, it basically does the same thing. It is easier, though, to install an extension than to install a piece of macro code. :-)
Finn
GUI language vs. document language.
That macro, however, does a better job, since it retrieves just the misspelled words.
Your extension has problems because it checks the GUI language, not the document language.
I write in Italian, and Linguist selects almost every word.
I'll keep using the macro. You should fix that feature.
GUI language vs. document language.
I just installed an Italian dictionary and tried to run Linguist on an Italian text after having set the default document language to Italian. In a text with 145 words it found just 4 unknown words.
Just set default document language correctly in the Tools > Options > Languages menu, then it will work.
Btw: One of the four unknown words was 'Berlusconi' :-)
OOo 3.0 beta: ‘List Unrecognized Words’ lists all words
I've installed the latest release of this extension and have run it on several documents with Russian as the language of the Default style. In all instances, the result is the same: ‘List Unrecognized Words’ lists all words, not only misspelled ones. Is this because my GUI language is English (as Russian is not available so far)?
Please check the new version...
Finn Gruwier Larsen
OOo 3.0 beta: ‘List Unrecognized Words’ lists all words
Thanks for your comment. I am sorry to say that the current version still checks the GUI language, not the document language. I have planned to make a new version that checks the document language instead. I will see if I can get it done soon.
Finn Gruwier Larsen
Nice tool!
I find this tool really useful!
Only, I would prefer the statistics to open in a popup instead of in a new document.
Thanks for this useful extension.
jo
New version ready
Version 1.1 adds readability (lix) measuring and other document statistics.
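For readers unfamiliar with lix: it is a readability index computed as the average sentence length plus the percentage of long words (more than six letters). A minimal sketch follows. It is not the extension's own code; the function name `lix`, the simple word pattern, and counting only `.`, `!` and `?` as sentence ends are all assumptions made for illustration.

```python
import re


def lix(text):
    """Return the lix readability score of text (0.0 for empty text)."""
    # Assumed tokenization: a word is any run of letters.
    words = re.findall(r"[^\W\d_]+", text)
    if not words:
        return 0.0
    # Assumed sentence count: runs of '.', '!' or '?'; at least one sentence.
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    # Long words are those with more than six letters.
    long_words = sum(1 for w in words if len(w) > 6)
    return len(words) / sentences + 100 * long_words / len(words)
```

For "Hello world. Readability matters." this gives 4 words in 2 sentences with 2 long words, so 4/2 + 100*2/4 = 52.0.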
It doesn't seem to work
For big documents (even not-so-big ones), the app freezes.
I tried with just one paragraph containing a few unrecognized words, and it creates a new, empty document.
Please send a document that doesn't work to finn (at) gruwier.dk.
Finn Gruwier Larsen
Language: Danish
Finn,
Great script for many reasons. But you forgot to say that the language is Danish ;-)
Leif Lodahl
http://lodahl.blogspot.com
Language
Hi Leif,
It should definitely work with other locales. As mentioned in the description, Linguist uses the GUI locale setting to determine which locale to use for spellchecking. I've just verified that it works with US English. But there might be locales that it doesn't work with, especially if there are locales (other than Danish) that do not conform to the language-COUNTRY formula. In fact, the script treats Danish as a special case, since for unknown reasons the GUI locale setting for Danish is not called 'da-DK', but just 'da'.
Finn
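The language-COUNTRY handling described in this reply could look roughly like the sketch below. It is an illustration only, not the extension's actual code: `parse_gui_locale` and `DEFAULT_COUNTRY` are hypothetical names, and the table holds only the Danish special case mentioned above.

```python
# Hypothetical table of GUI locale settings that lack a country part.
# Only the Danish case is known from the discussion.
DEFAULT_COUNTRY = {"da": "DK"}


def parse_gui_locale(setting):
    """Split a GUI locale string such as 'en-US' into (language, country)."""
    language, _, country = setting.partition("-")
    if not country:
        # No '-COUNTRY' suffix: fall back to the special-case table,
        # or to an empty country if the language is not listed.
        country = DEFAULT_COUNTRY.get(language, "")
    return language, country
```

With this, `parse_gui_locale("en-US")` gives `("en", "US")` and `parse_gui_locale("da")` gives `("da", "DK")`. Any other setting without a country part comes back with an empty country, which is where a spellchecker lookup might fail.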
Document language
Couldn't Linguist check the language of the document instead of the GUI? I often work with documents in languages other than the interface language and would like to use Linguist to create wordlists for spellcheckers.
Language settings
Hi Erdal,
I don't think there is a setting for the language of the document as such. In fact, each paragraph of a document can have its own language setting. But there is a setting called "default language for documents". If I used that, you could set it to the language you want to spellcheck against and still have another language in the GUI. I will consider that the next time I make a release.
Finn