Notes From Nature Talk

Some useful tools

  • majortim by majortim

    Wikipedia's lists are good tools for finding counties etc. There is a list of municipalities for each state of the USA (similar lists exist for other places).
    For example, https://en.wikipedia.org/wiki/List_of_municipalities_in_Florida (you can switch to another state via the link box).

    Posted

  • majortim by majortim

    Another good list, if you find a binominal name that's hard to read but could be a tree: https://en.wikipedia.org/wiki/Category:Trees_of_the_United_States

    Posted

  • majortim by majortim

    Geographic Names Information System, U.S. Geological Survey. http://geonames.usgs.gov

    Posted

  • majortim by majortim

    Finding counties and locations: http://mapper.acme.com/ (credit: cerabilia)

    Collectors' names: http://essigdb.berkeley.edu/query_people.html (credit: wekebu)

    Posted

  • majortim by majortim

    https://en.wikipedia.org/wiki/Category:Lists_of_mountains_of_the_United_States

    Posted

  • El_Lion by El_Lion

    Trouble reading texts? I had some records written in pencil which were hard to read due to low contrast.
    The solution: "Sheen", a visual webpage filter. (The tip came from the War Diary Zooniverse project.)
    https://chrome.google.com/webstore/detail/sheen/mopkplcglehjfbedbngcglkmajhflnjk?hl=en-GB

    Posted

  • HelenBennett57 by HelenBennett57

    Symbols to copy and paste:

    • Degree °
    • Female and male ♀ ♂
    • Plus-or-minus ±
    • Mu μ

    Posted

  • HelenBennett57 by HelenBennett57

    The Plant List, to search for scientific names - tip from AustinMast - http://www.theplantlist.org/

    Avibase, to search for bird names - http://avibase.bsc-eoc.org/avibase.jsp

    Posted

  • Mr._Kevvy by Mr._Kevvy

    Fractions to copy and paste: ⅛ ¼ ⅓ ⅜ ½ ⅝ ⅔ ¾ ⅞

    Posted

  • majortim by majortim

    Example of "American cursive": http://www.newamericancursive.com/uploads/assets/samples/2ndgrade/Corrie.jpg by A2H

    Posted

  • majortim by majortim

    Some non-English graphemes: Ä ä å Å ç Đ ð ë ğ Ł ł Ñ ñ õ Ö ö Œ œ Ø ø Ř ř Ş ş Š š ß Þ Ü ü Ž ž

    Posted

  • riskingraisin by riskingraisin

    @Mr. Kevvy: if you want a bigger lookup list, I have the full ITIS list of vascular plants & bryophytes (56.5K species). Shoot me a message if you want a copy. You can also download it from their website, but the default formatting is a bit inconvenient.

    Posted

  • joanball by joanball scientist

    I am using this thread and others to start a formal list of FAQs and useful tools. Please post more here if you have them and I will incorporate them. Thanks everyone!

    Posted

  • Mr._Kevvy by Mr._Kevvy

    Yay! Thank you. 😃

    And how could I forget my favorite utility to get the transcriptions done: ABBYY Screenshot Reader for Windows. Free to try for 15 days, $29.99 to buy. I don't think there's a Mac version, but there are other equivalent apps for Mac. It does OCR by reading text right off the screen from a defined area (it also has other options and does screen capture too), and the text can then be pasted into a word processor for editing and cutting into the fields.

    It only works on (most) typewritten or printed output. Anything handwritten, even neatly hand printed, comes out as gibberish. Of course, you have to check over the text carefully afterwards to fix "scanno" errors. Some caveats: it does second-level word matching, so close matches get turned into dictionary words, e.g. "jct" always comes out as "jet" and needs manual correction. Watch for doubled spaces too; if I spot any, I search/replace two spaces with one to get rid of them.
    Even with the slight drawbacks, I'd say it halves the time and effort to do the transcriptions, and anything to reduce tedium makes the job more interesting.
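
    For anyone who saves the OCR text to a file, the clean-up described above can be sketched in a shell. This is only an illustration: the function name clean_ocr is made up, GNU sed is assumed, and the "jet" rule assumes the real word "jet" never appears on your labels.

```shell
#!/bin/bash
# Hypothetical helper: tidy OCR text read from standard input.
clean_ocr() {
    # 1) collapse any run of two or more spaces into a single space
    # 2) turn the whole word "jet" back into "jct"
    #    (assumption: labels never contain the real word "jet")
    sed -E -e 's/ {2,}/ /g' -e 's/\bjet\b/jct/g'
}

# Example: doubled spaces and a mis-corrected "jct"
echo 'jet  US 41  and SR 50' | clean_ocr
# prints: jct US 41 and SR 50
```

    Words that merely contain "jet" (such as "jetty") are left alone, because the rule only matches the whole word.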

    Posted

  • HelenBennett57 by HelenBennett57 in response to joanball's comment.

    @joanball, that's great to hear!

    Posted

  • HelenBennett57 by HelenBennett57

    Special characters: á ä é è ê ë í μ ñ ó ö ü ~

    Special characters from the Swedish alphabet: Å å Ä ä Ö ö

    Posted

  • Mr._Kevvy by Mr._Kevvy

    Two (possibly!) excellent tips for Firefox users:

    How to enable spellcheck as you type in the Herbarium interface. Firefox by default doesn't spellcheck one-line fields; this will turn it on.

    How to edit the Firefox custom dictionary. Since it's a plain text file, you can grab my Herbarium Wordlist Custom Dictionary (posted in this thread), rename the plain .txt version to persdict.dat, rename your already-present persdict.dat to keep it as a backup, and put the new file in its place at the indicated path. Voila... spellchecking using a custom Herbarium dictionary right in Firefox in the Herbarium data entry interface. (Unfortunately it's case sensitive and so may flag all the capitalized words as being wrong...)
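
    On Linux or a Mac, the rename-and-copy step could look roughly like this. It is a sketch under assumptions: install_persdict is a made-up name, both paths are supplied by you (the profile folder location differs per system), and Firefox should be closed while you do it.

```shell
#!/bin/bash
# Hypothetical helper: install a plain-text word list as Firefox's
# personal dictionary (persdict.dat) inside a given profile folder.
# Usage: install_persdict WORDLIST_FILE PROFILE_DIR
install_persdict() {
    local wordlist=$1 profile=$2
    # keep the existing personal dictionary, if any, as a backup
    if [ -f "$profile/persdict.dat" ]; then
        mv "$profile/persdict.dat" "$profile/persdict.dat.bak" || return 1
    fi
    # the word list itself is left untouched; a copy becomes persdict.dat
    cp "$wordlist" "$profile/persdict.dat"
}
```

    To undo it, delete the new persdict.dat and rename persdict.dat.bak back to persdict.dat.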

    Posted

  • joanball by joanball scientist

    Thank you all for your input. I just posted a blog with many of these tools and FAQs.

    http://blog.notesfromnature.org/2014/04/17/faqs-and-useful-tools/

    Posted

  • Mr._Kevvy by Mr._Kevvy

    More of a tip than a tool (but very useful due to the Twitteresque 140-character limit in specimen comments):

    For links to this site, i.e. to specimen comments and board posts, you can leave http:// talk.notesfromnature.org/ out of link URLs and the board will add it automatically. For example, rather than

    [ My Specimen Link ] ( http:// talk.notesfromnature.org/#/subjects/ANN0123abc )

    you could just put

    [ My Specimen Link ] ( #/subjects/ANN0123abc )

    and the result will be the same (spaces added in examples so that they don't turn into actual links.)
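
    If you have already written comments with full links, the shortening can be done mechanically. A small sketch, assuming a shell with sed; the function name shorten_talk_link is made up for illustration:

```shell
#!/bin/bash
# Hypothetical helper: strip the Talk site prefix from links in text
# read from standard input, leaving the short "#/..." form.
shorten_talk_link() {
    sed -E 's,https?://talk\.notesfromnature\.org/#/,#/,g'
}

echo '[My Specimen Link](http://talk.notesfromnature.org/#/subjects/ANN0123abc)' | shorten_talk_link
# prints: [My Specimen Link](#/subjects/ANN0123abc)
```

    Links to other sites pass through unchanged, since only this one prefix is matched.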


    Another excellent plant database is the USDA's at http://plants.usda.gov (ITIS refers to it often.) It has habitat/distribution maps for each species, useful for ambiguities.

    Posted

  • Mr._Kevvy by Mr._Kevvy

    Herbarium Cheat Sheet: Top redetermined/corrected scientific names for copy/pasting

    I keep this, as well as other copy/paste strings, open while transcribing. It saves a lot of time that would otherwise be wasted typing the same names by hand, since OCR is no help with them (they are often handwritten later onto a typed/printed tag).


    Asimina pygmaea

    Asimina reticulata

    Aletris lutea x obovata

    Calopogon tuberosus var. tuberosus

    Cyperus entrerianus

    Cyperus erythrorhizos

    Cyperus esculentus var. leptostachyus

    Cyperus lupulinus ssp. lupulinus

    Cyperus pseudovegetus

    Cyperus retroflexus var. retroflexus

    Cyperus sanguinolentus

    Eleocharis obtusa

    Iris savannarum

    Fimbristylis autumnalis

    Kyllinga odorata

    Lachnanthes caroliniana

    Lachnocaulon anceps

    Lipocarpha micrantha

    Lycopodiella alopecuroides

    Lycopodiella appressa

    Lycopodiella prostrata

    Mayaca fluviatilis

    Najas guadalupensis var. floridana

    Ophioglossum petiolatum

    Oxypolis filiformis

    Palhinhaea cernua

    Platanthera cristata

    Platanthera nivea

    Polystichum acrostichoides

    Rhynchospora glomerata

    Sagittaria filiformis

    Sagittaria graminea ssp. chapmanii

    Sagittaria graminea ssp. graminea

    Stenanthium densum

    Syngonanthus flavidulus

    Thelypteris kunthii

    Triantha racemosa

    Woodwardia virginica

    Xyris ambigua

    Xyris difformis var. difformis

    Posted

  • Mr._Kevvy by Mr._Kevvy

    Herbarium & Macrofungi Wordlist Custom Dictionary that I put together, very handy if you use screen capture OCR and/or an external editor. I figured that if SERNEC has twelve million possible specimens to eventually transcribe, any tools to automate the process and make it faster and easier are essential!

    Features:

    • Contains ~49K genus/species/botanist names and botanical terms (Complete ITIS listing)
    • Word 2003/2010, Open Office 4 and plain ANSI text formats
    • Simple install directions included
    • I'll try to update it occasionally if required

    Enjoy. 😃

    Edit: With huge thanks to riskingraisin (see above) I've updated this to v2.0 with the complete ITIS listing of genus/species names and it shouldn't require an update for a long time. Please send me a PM if you find anything missing.

    Edit2: Can really call it complete now... synonyms were missing. It's up to v3.0 with just over 45K entries.

    Edit3: See above... even works right in Firefox with no external editor required.

    Edit4: v3.1 uploaded; incremental minor additions as I find missing words (bio. terms) while transcribing (also started including geographical names, e.g. towns, lakes.)

    Edit5: v.4.0 uploaded. I can't forget our friends transcribing Macrofungi, so the entire ITIS list of about 5,100 species of fungi... genus, species and var. names (with "orthographic variants" aka misspellings removed) is now part of the dictionary. This necessitated a name change so is now Herbarium & Macrofungi rather than just Herbarium. I'm also moving this post to the end so that it gets noticed. Hope that this proves as useful in Macrofungi as I have found it! I couldn't do without this (and OCR) and I think it's important when we start "Transcribe Once and Validate" which may make it easier for mistakes to slip through.

    Edit6: v4.1 uploaded; more incremental minor additions of missing bio. terms and geographical names, e.g. towns, lakes, etc.

    Edit7: v.5.0 uploaded; very important update. Added all U.S. counties that weren't already there. Also removed 662 "orthographic variants" aka misspellings and database artifacts that had slipped through in the first ITIS export and were present in the dictionary; verified that none of them are valid in any records in any other contexts. (This took four days better spent transcribing... I hope someone is actually using this thing other than me!)

    Edit8: v5.1 uploaded; more incremental minor additions of missing bio. terms and geographical names, e.g. towns, lakes, etc.

    Edit9: v5.2 uploaded; more incremental minor additions of missing bio. terms and geographical names, e.g. towns, lakes, etc.

    Posted

  • HelenBennett57 by HelenBennett57

    The joy of ClipX

    Inspired by Mr Kevvy's cheat sheet above, I find the free tool ClipX incredibly helpful in my daily life and am about to start using it for repetitive transcription.

    ClipX lets you work from the last n of your clipboard entries, and is nicely configurable. I'm about to start using it for repetitive ornithology ledger entries, instead of retyping.

    Edit: transcribing an ornithology ledger page using ClipX took 27 minutes, as opposed to an hour or two without it.

    Posted

  • am.zooni by am.zooni

    A useful site for US counties is http://www.hometownlocator.com/. Within a state you can list all the places that start with a specified letter, which can help if the city name is misspelled or a couple of letters are illegible.
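
    The same idea works offline if you keep a plain-text list of place names, one per line. A minimal sketch, assuming bash and grep; the function name match_place and the sample file are made up for illustration, and "?" stands for one letter you can't read:

```shell
#!/bin/bash
# Hypothetical helper: list candidate names when some letters are illegible.
# Usage: match_place PATTERN FILE   ("?" = exactly one unreadable letter)
match_place() {
    local pattern=${1//\?/.}    # each "?" becomes the regex "any one character"
    # -i ignores case, -x only accepts matches covering the whole line
    grep -i -x -e "$pattern" "$2"
}

# Example with a tiny sample list of Florida municipalities
printf 'Tallahassee\nTampa\nTavares\n' > places.txt
match_place 'Ta?a?es' places.txt
# prints: Tavares
```

    A dot in a name (as in "St. Cloud") is also treated as "any character", which is harmless here because it still matches the dot itself.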

    Posted

  • Bonnie123 by Bonnie123

    Geonames to find the counties; it also works to find the country: http://www.geonames.org/

    A website that lists the Alt codes for degrees, symbols etc. It works for a PC; not sure about a Mac. http://www.tedmontgomery.com/tutorial/ALTchrc-a.html

    Posted

  • maggiej by maggiej

    If you're using Chrome, there's an extension called Aliaser that lets you create shortcuts for names/phrases that you find yourself typing over and over again. It can save you a lot of keystrokes. For instance, whenever I need to enter Tahquamenon Falls State Park, I now just type "tf" and press the Tab key. If you'd like to try it, go to the Chrome Web Store, click on Extensions, and search for Aliaser. It's the one from Daniel Woznicki. Full disclosure: he's my son. That said, I've been using it for a few weeks now and it works quite well. It's simple to add and delete aliases, and you can select the key you want to trigger the substitution. But I find that Tab works quite well for Macrofungi, and it doesn't interfere with the normal tab functions.

    One thing to note: after you add an alias, you have to refresh the page with the circular arrow to make it available.

    Posted

  • Mr._Kevvy by Mr._Kevvy

    I've been away for a while, but it looks like the job has been left in good hands as it's almost completed. I may resume transcribing if more images are loaded, i.e. if other herbaria join the program to get digitized, but there is little point now. (Just logged in to see what I had missed...)

    Anyways, on to the real point of this post: one of the things I did over this time was dump Windows and install Linux on 7/9 of my computers. And I'm happy to say that, yes, Linux does offer a way to do screen capture OCR free, non-commercially, without having to buy anything.

    Brief directions: install the packages tesseract-ocr, imagemagick, scrot and xsel from your distro's package manager (this is extremely easy to do, just find and click Install, and will take maybe a minute for all of them... Linux is very easy to use these days compared to before!)
    Right-click desktop, choose to create new text document. Paste the following:

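    #!/bin/bash
    # Screen-capture OCR: select a screen region, OCR it, copy the text to the clipboard.
    # Dependencies: tesseract-ocr imagemagick scrot xsel

    SCR_IMG=$(mktemp)                # base name for the temporary files
    trap 'rm -f "$SCR_IMG"*' EXIT    # remove the temporary files when the script exits

    scrot -s "$SCR_IMG.png"          # drag a marquee to capture a region
    # greyscale and enlarge the capture; should increase detection rate
    mogrify -modulate 100,0 -resize 400% "$SCR_IMG.png"
    tesseract "$SCR_IMG.png" "$SCR_IMG" &> /dev/null    # writes $SCR_IMG.txt
    xsel -bi < "$SCR_IMG.txt"        # put the recognized text on the clipboard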
    

    Save the file with a .sh name, e.g. OCR.sh, and quit the editor. Right-click the file, edit its properties and enable running it as a program. Right-click the desktop again and choose to open a terminal window there. Make the terminal window tiny and stick it in a corner so it doesn't cover what you want to read, i.e. the specimen card! Launch a text editor for editing.

    Type ./OCR.sh, or whatever name you gave it (you need the ./ so the shell runs the file in the current directory instead of searching for a command of that name). You can then drag a marquee around the text to read; when you let go, it will be OCRed and put in the paste buffer, ready to be pasted into your document. To do another, switch to the terminal window, press the up arrow to get ./OCR.sh back, then press Enter again. I've tried it on many images and it works quite well, certainly comparably to ABBYY Screenshot Reader, and far better than anything else I tried.

    I hope this proves useful over the long term of this and other text digitization projects.

    Edit: @am.zooni thank you for the fast reply... and compliments! :^) I'll be checking in as this develops to see if more images appear online.

    @HelenBennett57: You too! Looks like you both stuck around to finish up.

    @md68135: Geez... I even get a project scientist saying hello and welcome back. Thanks! I feel like a local celebrity. 😄 Looking forward to the beta and will certainly try to provide some feedback.

    Posted

  • am.zooni by am.zooni

    Hey! @Mr._Kevvy is back! Long time no see. I hope you've been well. You've been missed, and mentioned appreciatively by more than one of us for your tips and support, not to mention the massive volume of transcription work.

    NfN is going to be changing the interface soon, so there's a bit of a rush on to complete the current collections before that happens. See blog post 1 and blog post 2 for info about that.

    Posted

  • HelenBennett57 by HelenBennett57 in response to Mr._Kevvy's comment.

    Hello! Good to see you again!

    Posted

  • md68135 by md68135 scientist

    Hey @Mr._Kevvy!
    Great to see you again. As @am.zooni mentioned there is a lot going on at the moment with the new site.
    I have been really busy with that, but trying to keep one foot in the "old" system as well until we transition.

    Posted