
CiteIt.net is an open-source project (MIT license). Its mission, to create a higher standard of citation, is accelerated by the help of volunteers.
Programmers:
If you know how to code (especially in Python), contact Tim to help out.
Top Tasks
Here are a few of the top-priority tasks you can help with. (The complete list is in Pivotal Tracker.)

- Create a WordPress Gutenberg Block Editor plugin to replace the old WordPress v4 plugin
- Develop the ability to add metadata to quotes using custom HTML attributes
- Verify Quote before allowing Citation: Enforce Rules or Flag Suspicious Quotes that do not exactly match the original version, while allowing for structured variation:
  - Modify Parsing to handle [C]apitalization:
    - “[T]his is a quote that was pulled from the middle of a sentence, but capitalized to fit with the new context”
  - Allow new [additional] words to be added if the word is surrounded by brackets.
  - Modify Parsing to handle Ellipses (..):
    - “The quote was pulled from a sentence .. and the middle was skipped. This was noted with an ellipsis.” It would be nice if the middle of the quote could be expanded.
- Create a script to “upgrade” existing Wikipedia quotes into Contextual Citations
- Allow unique specification if a phrase occurs multiple times in a document.
  - This could be done by specifying enough of the “before” and “after” text to make the phrase unique.
- Design an interface and Citation Structure for Nested Quotes:
  - How does CiteIt have to be modified to allow quotations of quotations, nested multiple levels deep?
- Make YouTube (and other) transcripts highlight the current word(s) while the recording plays. This will likely require creating a format for an intermediate data structure that stores the start and end times for each word or phrase in the transcript. (example: YouTube Speech-Text API script)
- Create a Standardized Text Version of Every Submitted page. The standardized text version is used to create the contextual JSON file containing the 500 characters before and after the quote:
  - from HTML
  - from PDF
  - from Word Doc
  - from OpenOffice
  - from PowerPoint
  - from Image
- Modify Document class to save a copy of the original file in its original encoding to S3-style storage
- Call the archive.org API to archive the citing and cited pages if the page is new or the hash has changed
  - It would be nice if the archive process didn’t slow down the citation process. Since archiving could take several seconds, it should probably be done asynchronously.
- Set up Tests to verify that changes to the web service do not break existing quotes.
- Set up a CI Server to run tests before code is committed to GitHub
- Add Ability to Standardize Quotes on Canonical Sources
  - The Bible has multiple versions and translations. Can quotation be done in a way that quotes from different versions and different websites are standardized?
- Create a streamlined process of uploading audio and creating transcripts from Google Speech-to-Text. See work already done in Git Repo. Example: Otto von Bismarck biography. (Links)
- Develop a web interface to crowdsource the process of cleaning up auto-generated transcripts. Would a wiki be of use here?
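
Several of the parsing tasks above (bracketed [C]apitalization, bracketed [additional] words, ellipses, phrases that occur more than once, and the 500 characters of context) could be prototyped in Python along the following lines. This is only a minimal sketch and not CiteIt's actual implementation: the function names, the rule used to tell a capitalization bracket from an inserted word, and the "before"/"quote"/"after" field names are all assumptions made for illustration.

```python
import json
import re

# Assumed rule: "[T]his" (one bracketed letter glued to a word) marks a
# capitalization change; any other bracketed text is an editorial insertion.
INSERTION = re.compile(r"\[(?![A-Za-z]\][A-Za-z])[^\]]*\]")
# Split points: a capitalization bracket, or an ellipsis ("..", "...", "…")
# together with the whitespace around it.
TOKEN = re.compile(r"(\[[A-Za-z]\]|\s*(?:\.{2,}|…)\s*)")


def _literal(text: str) -> str:
    """Escape literal quote text; any whitespace run matches one or more spaces."""
    chunks = re.split(r"(\s+)", text)
    return "".join(r"\s+" if c.isspace() else re.escape(c) for c in chunks)


def quote_to_pattern(quote: str) -> re.Pattern:
    """Turn a quote with [C]aps, [added] words and ellipses into a regex."""
    # Drop inserted words, then collapse the whitespace they leave behind.
    cleaned = " ".join(INSERTION.sub(" ", quote).split())
    parts = []
    # re.split with a capture group alternates: literal, token, literal, ...
    for i, piece in enumerate(TOKEN.split(cleaned)):
        if i % 2 == 0:
            parts.append(_literal(piece))
        elif piece.startswith("["):
            letter = piece[1]  # accept either case of the bracketed letter
            parts.append(f"[{letter.lower()}{letter.upper()}]")
        else:
            parts.append(r".*?")  # ellipsis: skip the shortest possible gap
    return re.compile("".join(parts), re.DOTALL)


def find_quote(quote: str, source: str) -> list:
    """Return the (start, end) span of every place the quote matches."""
    if not quote.strip():
        return []
    return [m.span() for m in quote_to_pattern(quote).finditer(source)]


def build_context(source: str, start: int, end: int, width: int = 500) -> dict:
    """Collect up to `width` characters before and after the matched quote."""
    return {
        "before": source[max(0, start - width):start],
        "quote": source[start:end],
        "after": source[end:end + width],
    }


if __name__ == "__main__":
    source = ("He said that this is a quote that was pulled from the "
              "middle of a sentence, and it went on.")
    quote = "[T]his is a quote [really] that was pulled .. of a sentence"
    spans = find_quote(quote, source)
    if len(spans) == 1:
        print(json.dumps(build_context(source, *spans[0]), indent=2))
    else:
        print(f"{len(spans)} matches: quote is missing or ambiguous")
```

If find_quote returns more than one span, the phrase is ambiguous, and the citing author would need to supply more of the “before” or “after” text to make it unique. Because the matched span covers the skipped text, it also gives a starting point for expanding the middle of a quote that contains an ellipsis.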
Interested?
- Contact Tim if you’re interested in helping.