Why the repetitive transcribing?

by reddder in response to j-walk's comment.

Pop & wreness: I have experienced what you both describe, but I've seen no reason to believe that 'someone is checking all this as its progressing and making sure things are going ok on the server/technical end. And reading the discussion suggestions'. If they are aware, they have not deigned to communicate their thoughts to the workforce. We are like the proverbial mushrooms . . . and it gets more annoying as the months pass by. If things can't be improved, just tell us: I think we can handle it. Being ignored is an altogether different kettle of fish.

Posted March 24, 2014 5:05 AM

by j-walk

Maybe the true purpose of the site is to analyze a participant's reaction to stress. : )

@ dmbrgn: Did you notice your special mention in the March 21 blog entry?

Posted March 25, 2014 3:26 AM

Yes, I was quite surprised. They succeeded in scuttling, to a small degree, my complaints about lack of open and mission-driven communications. Nevertheless, I say 'thank you' for the recognition.

Posted March 25, 2014 4:53 AM

by wreness

Where did you get a mention? What blog? Are you famous now? 😃 Kudos! I say you have to buy us all pizza now.

I just passed 3000 entries and my stress level has passed "mildly amused" to "sarcastic", so it's not looking good here.

Have you gone over to Galaxy Zoo? That place is so busy it's incredible. Boards with thousands of posts. Every hour, every day, people of Import answering every question and engaging in discussions, thanking the volunteers. It's like Disneyland over there.

Posted March 26, 2014 4:57 AM

by j-walk in response to wreness's comment.

http://blog.notesfromnature.org/

Posted March 26, 2014 11:16 AM

by Hayduke13

I came across the Essig Museum's database (http://essigdb.berkeley.edu/) when looking up a collector. Then I noticed that the record I was transcribing was already in their database and was noted as being added some time in 2012. What gives? Are some records being transcribed again?

CASENT 8177313; collector EE Ball Jr.

May be that this is an anomaly. The next few records I checked didn't seem to be in there.

Posted March 26, 2014 4:02 PM

by CTidwell3

That's basically what I have seen too. Only some records are there in that database. It provides a good base to compare how well transcriptions by a professional compare to consensus data from a project like this.

Similarly, if the quality of data from this project is confirmed to be high, it can help find typos and mistakes there. I know I have found errors looking in the Essig database, where I have seen 10+ records from the same location and collector with the same month and day, but different years where I was pretty sure they should all be the same year.

Posted March 26, 2014 8:50 PM

by reddder in response to wreness's comment.

Well, I've reached the 12,000+ level for Notes from Nature, plus another 30,000 on other projects [weather, Serengeti, asteroids, etc.] and my observations have become far more pointed. I mention the numbers not to blow my own horn, but because I believe in the work and I usually find it interesting. However, the 'Bird Ledger project' finally did me in. I just couldn't keep typing the same info a hundred times. The big joke about that project is that they say you can do a page in 15 min! Well, I can't!!!! I'm back to CalBug. With so many specimens to record, you'd think OZ would want to make things work better by eliminating unnecessary steps/keystrokes. However, I sometimes get the feeling that OZ doesn't care, as long as those w/picks & shovels continue the grind. If anyone 'on the bridge' is listening: give us some new tools so we can get the job done in our lifetimes.

Posted March 26, 2014 9:21 PM

by wreness

WOW dmbrgn, congrats! I think. Amazing. My liege :curtsey:

I'm not sure what I have here as it wasn't recording my counts for a long time, now it's over 3000 but I've done a few thousand on the sea floor, too, I'm nuts about that but am sick of scallops. I loved the ornithology stuff but I agree 15 minutes? Puh-lease. Gawd, it'd take me an hour or two sometimes. I'd type the thing in a WORD document as I went to keep my place and cut/paste the redundant info back and forth but when you'd get a page where you had to type a lot, MAN. It was work. I tried the transcribing of the music pages recently but I'd rather eat dead worms. ZZZZZzzzzzzzzz.

@Hayduke - I asked that same question of one of the Scientist gods here awhile back (I'm not as polite as you are..it was more like "HEY why am I typing all this up when you already have it???fer cry sakes") I was told that that's part of the point here - they are comparing our data with what they already have to make sure it's accurate (seems kinda redundant but who am I to quibble) and then of course there's all the ones that are NOT entered, that they are needing to enter.

In the case where they already have the data listed, then, I'm thinking it would have been a lot more efficient if they just had us confirm it was all OK and then be done with it instead of triple-quadruple handling something that's already a done-deal. I'd rather have known about the database, pulled up a bug, looked to see if it was listed, compared the tag & database, said "yeah it's all ok" or "I added such and such" and then check it off the Galactic Input List. But then I'm just a Minion 😃

Picks and shovels. I like the visual. Goes with the Pitchforks and Torches.

Posted March 27, 2014 2:54 AM

I question not only presenting us with the same entries repeatedly (I just had one like this: I found I had already commented on it, which prompted me to find this thread) but the necessity of always having the same number of multiple transcriptions (four) in the first place. It seems like a vast waste of effort if only 1/4 of what we are doing is being kept.

This could all be alleviated if records that had even a single transcription would still appear in the approval process and could be approved as such if they were done properly the first time and marked as complete so that no one else would have to redundantly transcribe them again. And if not, then they would remain in the active set to be randomly assigned to other transcribers until such time as they were completed properly.

Trying to think of the long-term, if it's actually planned to bring all twelve million SERNEC-member specimens onto this project, a change like this is absolutely essential. This project has been around for about a year and has gathered ~153,000 transcriptions. If 12M SERNEC records are digitized and brought online, that will turn into 48M transcriptions required. At the same rate, this will take over 313 years to complete. 😃
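The estimate above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic using the figures quoted in the post; treating "about a year" as exactly one year is an assumption, and the single-transcription figure is a hypothetical comparison, not something the project has committed to.

```python
# Figures quoted in the post; "about a year" is taken as exactly 1 year (assumption).
transcriptions_so_far = 153_000
years_elapsed = 1
specimens = 12_000_000
transcriptions_per_specimen = 4  # the current four-transcription rule

rate_per_year = transcriptions_so_far / years_elapsed
required = specimens * transcriptions_per_specimen

years_needed = required / rate_per_year
# Hypothetical comparison: one transcription per specimen at the same rate.
years_single_pass = specimens / rate_per_year

print(f"Required transcriptions: {required:,}")
print(f"Years at current rate (4x): {years_needed:.1f}")
print(f"Years at current rate (1x): {years_single_pass:.1f}")
```

Even with a single transcription per record, the same rate would still mean decades of work, so the number of volunteers matters as much as the redundancy rule.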

Posted April 13, 2014 2:54 PM

by HelenBennett57 in response to dmbrgn's comment.

Thank you very much for quantifying the problem with which we are confronted. You and others have put forth possible solutions/improvements to our work methods, but the management says we are too poor or too undermanned to enact these changes. Why field such a task, if it is virtually impossible to complete?

Posted April 13, 2014 6:29 PM

Needed next: the quantification of how much would be saved by some of the simpler UI improvements.

Mr. Kevvy - the multiple transcriptions are compared automatically like gene sequences and a unified transcription produced as the result. There was a blog post somewhere... could dig it out. I felt much better about redundancy and data quality after reading it.
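For anyone curious what "comparing transcriptions" might look like in the simplest case, here is a rough sketch. It is not the project's actual method (the blog post describes something closer to sequence alignment); it only does a majority vote on one field after tidying up spacing and capitalisation. The function name and the sample values are made up for illustration.

```python
from collections import Counter


def consensus(transcriptions):
    """Pick the most common transcription of one field.

    Whitespace and letter case are ignored when comparing. Returns the
    first original spelling of the winning value and the fraction of
    transcribers who agreed with it.
    """
    def normalise(text):
        return " ".join(text.split()).casefold()

    votes = Counter(normalise(t) for t in transcriptions)
    winner, count = votes.most_common(1)[0]
    original = next(t for t in transcriptions if normalise(t) == winner)
    return original, count / len(transcriptions)


# Made-up sample: four volunteers typing the same collector field.
sample = ["E. E. Ball Jr.", "E. E.  Ball Jr.", "e. e. ball jr.", "E. F. Ball Jr."]
print(consensus(sample))
```

A real reconciliation step would also need to handle partial matches, such as one transcriber leaving out a middle initial, which is where alignment-style comparison comes in.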

Posted April 14, 2014 6:11 PM

by joanball scientist

Here are the blog posts on the process of checking transcriptions:

http://blog.notesfromnature.org/2014/01/14/checking-notes-from-nature-data/

http://soyouthinkyoucandigitize.wordpress.com/2014/01/14/412/

Your comments about problems and potential solutions are very valuable to this project. I am forwarding all of this to the steering committee to see what can be done, and will keep you posted. Thank you!

Posted April 14, 2014 6:26 PM

by robgur scientist, admin in response to Mr. Kevvy's comment.

We have the same thought on how to improve the UI and hope we can potentially implement that solution sooner than later. Notes from Nature is a labor of love but there is absolutely no spare capacity from the Zooniverse, really, and the science team doesn't have the resources right now to help. We are hoping to get there sooner than later... and we agree that these are great ideas!

Posted April 14, 2014 7:37 PM

@ joanball & robgur: Thanks, and it's good to see the project scientists back around here again. :^)

Posted April 14, 2014 11:56 PM

by Mr._Kevvy in response to dmbrgn's comment.

Has anyone responded to your calculations that it will take 313 years to finish this project following the current modus operandi? By that time, humans may have evolved beyond processing information in this form and all will be moot.

Posted June 5, 2014 10:51 PM

Nary a peep since joanball above indicated it would be passed on. However that at least was something, and more than expected. 😃

Posted June 5, 2014 11:22 PM

by darryluk in response to dmbrgn's comment.

Cut and paste makes it easy!

Posted June 28, 2014 8:59 PM

As well as transcribing here, I also do plenty of Citizen Science with BOINC, which is a platform for distributed computing primarily for scientific research.

Much like this platform, results need to be gathered from multiple random participants for each parcel of work to prevent bad/fake results and for accuracy. What a BOINC project does to minimize redundancy is to initially send each "work unit" to two participants only. When the results are returned, they are compared. If they match, then this is considered a completion. If they don't, only then is the work sent to a third participant; it will be re-sent as many times as required for concurrence.

This system has been in place for about a dozen years for several dozen BOINC projects, so it seems to have been tested as the most efficient and reliable.
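The scheme described above can be sketched roughly as follows. This is an illustration of the idea only, not BOINC's actual validator code; the function name, the `quorum` and `max_results` parameters, and the sample values are all hypothetical.

```python
from collections import Counter
from typing import Callable, Optional, Tuple


def collect_until_agreement(
    next_result: Callable[[], str],
    quorum: int = 2,
    max_results: int = 10,
) -> Tuple[Optional[str], int]:
    """Request results one at a time until `quorum` of them match.

    Returns (agreed_value, results_used). agreed_value is None if no
    value reached the quorum within `max_results` attempts.
    """
    counts = Counter()
    for used in range(1, max_results + 1):
        counts[next_result()] += 1
        value, votes = counts.most_common(1)[0]
        if votes >= quorum:
            return value, used
    return None, max_results


# Two participants agree straight away: only two results are needed.
matching = iter(["Quercus alba", "Quercus alba"])
print(collect_until_agreement(lambda: next(matching)))

# The first two disagree, so a third participant breaks the tie.
disputed = iter(["Quercus alba", "Quercus rubra", "Quercus alba"])
print(collect_until_agreement(lambda: next(disputed)))
```

When most records are transcribed correctly the first time, the average cost stays close to two transcriptions per record instead of a fixed four.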

Posted June 29, 2014 5:42 PM

Unless I've been asleep at the switch, I haven't seen anything responding to your msg about the elephant in the room: 313 YEARS to finish this project at the current rate of activity. Such a statistic really makes the whole project pointless. I still keep plodding away, hoping there will be some fix to the 4 transcription rule for each entry. If Jon Stewart were interested in citizen science projects, I'm sure we'd get some pointed barbs thrown this way. I recently saw the Monty Python reunion show, and when they talked about the success of the British Radio Ballet, I realized what a brilliant skit could be built around a project that is not looking for a cure for cancer in 13 yrs, but a project that is simply hoping to digitize archived specimens of life forms in 300 years.

Posted August 9, 2014 10:16 PM

I still plod away too... I'm sure it raised an eyebrow or two and I expect an improvement will eventually appear. Perhaps it isn't planned to bring the entire set of SERNEC-member herbaria in, but if it is, I think one will be required! But, I still enjoy transcribing (I do it with headphones on, and go off into The Zen Zone... it's peaceful.) So I'll keep at it pending an outcome. Hope I didn't scare anyone off. 😄

Oh, and I think that all the mistakes I am finding and reporting are possibly more valuable than the actual transcriptions (especially with the transcriptions being worth 1/4 their apparent data size)... the output isn't going to be worth anything with thousands of errors in it. So that also keeps me going.

Posted August 9, 2014 11:10 PM

by reddder in response to robgur's comment.

Well, I'm w/you. I'm hooked -- just feel the need, occasionally, to let management know we're still at our posts.

Posted August 9, 2014 11:49 PM

by robgur scientist, admin

We totally agree re: rate of transcription and one of the highest priority plans is to move to a "transcribe once and validate" approach, which I think would be much more efficient. The only reason we haven't implemented it is "capacity" --- we don't have the resources yet to make changes like that to how NFN works. We are very actively pursuing the resources to make such changes, though, and hope you are willing to stick it out while we move from what we all think is a working but still "not quite there" prototype to something much better, and much more engaging and useful.

Posted August 10, 2014 3:13 PM

Your message brings hope & joy. Yes, I'll stay at it in the hope of seeing the promised land.

Posted August 10, 2014 5:55 PM

by Mr._Kevvy in response to robgur's comment.

Yay! Thanks for the confirmation.

Posted August 10, 2014 10:07 PM

by robgur scientist, admin in response to dmbrgn's comment.

We are working hard on a "ditto function" that can hopefully help a bit here for repetitive entries. We know it matters!

Posted August 11, 2014 3:22 PM

by reddder in response to robgur's comment.

Thanks again for your answer. The 'ditto function' will really enable us to up the productivity and get the info 'on the street' a lot sooner.

Posted August 11, 2014 7:00 PM

by robgur scientist, admin

Hoping this can happen really soon --- we may prototype within a week or two.

Posted August 12, 2014 6:40 PM

Did we just go to the prototype of "transcribe once and validate"? I got logged out of my session and then the numbers changed drastically for the completions to:

Total Images: 52,552
Active Images: 14,166
Complete Images: 38,386
73% completion (from 94%)

If so, yay! And I will be more careful as I hope everyone will with the transcriptions now to minimize errors.

Posted October 9, 2014 3:35 PM

by HelenBennett57

Surely some of us would have noticed that we're being expected to do some validation? I'm assuming something will/would show up in the UI and that it would be loudly announced - @Robgur is that right?

Posted October 14, 2014 2:11 PM

by Mr._Kevvy in response to robgur's comment.

Why would you assume that anything would be loudly announced? I'm hoping Mr. K is on to something. What is beyond dispute is that the management made radical changes to the program statistics about a week ago w/no explanation whatsoever. Why can't someone take the time to forward an explanation for those who toil in their vineyard?

Posted October 15, 2014 5:09 PM

by HelenBennett57

Hehehe, cos I'm still not entirely 100% cynic! ... yet. And also because there was a blog post about the new ditto function in Calbug, so I'm assuming an even bigger change would definitely get a blog post.

Posted October 15, 2014 5:13 PM

by DZM admin

According to the development team here at the Zooniverse, Notes from Nature is still very much operating under a four-transcription model, and the 1+validation model remains a long-term plan.

Hope that this helps!

Posted October 17, 2014 4:20 PM

by robgur scientist, admin in response to dmbrgn's comment.

Yes, we have been meaning to post about the changes to the numbers. The long and short is that we haven't ever tallied the Ornithology ledger work in Notes from Nature, and we don't yet have a way to do that effortlessly --- it seems like a simple fix but it's not, unfortunately. So we manually calculate effort (# of records completed) every couple weeks, and this increments numbers on the main page. I will sketch a post on this because it deserves further broadcast.

Posted October 17, 2014 4:20 PM

by robgur scientist, admin in response to robgur's comment.

Oh we also changed our collections numbers as well for a variety of reasons. I will fold this all into one blog post! This weekend.

Posted October 17, 2014 4:29 PM

Thanks for the update also to DZM... I'm thinking that it was a coincidence: the completed Herbarium images that were quad-transcribed were removed from the project, and this happened to be 75% of the total, which made it appear that Transcribe Once and Validate was now active.

Ah well, we stick with the 300+ year estimate for a while if that is the case. 😃

Posted October 17, 2014 5:00 PM

by Mr._Kevvy in response to dmbrgn's comment.

I'm kind of disappointed as I won't be here in 2314 when the project is completed. Hopefully, our work will be made available before the last item is processed. Mr. K: thanks for the too-good-to-be-true assessment. I enjoy your observations.

Posted October 17, 2014 6:24 PM

Thanks. 😃 I guess we are akin to the ancients who planted trees and vines that grew to tremendous size that they never got to enjoy.
Back to the vineyard for me!

Posted October 17, 2014 11:37 PM