
I uploaded a Docker Image of my CiteIt Webservice to Docker Hub.
Building trust in media
by Tim Langeman
Here’s how to generate transcripts with the Google Speech-to-Text API.
I’m providing instructions for running this on a micro Debian instance on Google Cloud Compute Engine.
Here’s how to download an interview Tyler Cowen did with Malcolm Gladwell.
The youtube-dl package from apt-get seems to be outdated (I’m using Debian).
If you can’t run youtube-dl because Debian ships an old version, try:
sudo apt install python3-pip
sudo pip3 install --upgrade youtube-dl
sudo apt-get update
sudo apt-get install ffmpeg
youtube-dl --extract-audio --audio-format wav --audio-quality 5 --postprocessor-args "-ac 1" -o 'malcolm_gladwell.%(ext)s' 'https://www.youtube.com/watch?v=ehlhrqSWPbo'
Install gcsfuse to mount the bucket, or copy the file to the bucket with gsutil:
gsutil cp malcolm_gladwell.wav gs://bucket_name/
gcloud ml speech recognize-long-running \
'gs://citeit_speech_text/malcolm_gladwell.wav' \
--include-word-time-offsets \
--language-code='en-US' \
--async
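Once the job finishes, the word time offsets requested with --include-word-time-offsets still need to be pulled out of the JSON result. Here is a minimal sketch of one way to do that. It assumes the response has the shape results → alternatives → words, with "startTime" and "endTime" given as duration strings such as "1.300s". The function names and the sample data are made up for illustration and are not real output.

```python
import json


def parse_seconds(duration: str) -> float:
    """Convert a duration string such as '1.300s' into seconds."""
    return float(duration.rstrip("s"))


def word_timings(response: dict) -> list:
    """Flatten the first alternative of each result into (word, start, end) tuples."""
    timings = []
    for result in response.get("results", []):
        alternatives = result.get("alternatives", [])
        if not alternatives:
            continue
        for item in alternatives[0].get("words", []):
            timings.append((
                item["word"],
                parse_seconds(item["startTime"]),
                parse_seconds(item["endTime"]),
            ))
    return timings


# Hypothetical sample in the assumed response shape, not real output.
sample = json.loads("""
{"results": [{"alternatives": [{
  "transcript": "hello world",
  "words": [
    {"word": "hello", "startTime": "0s", "endTime": "0.400s"},
    {"word": "world", "startTime": "0.400s", "endTime": "1s"}
  ]}]}]}
""")

for word, start, end in word_timings(sample):
    print(f"{start:6.2f} {end:6.2f}  {word}")
```

The (word, start, end) tuples are the raw material for lining up a quotation with the matching moment in the audio.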
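from google.cloud.speech import types  # google-cloud-speech releases before 2.0

config = types.RecognitionConfig(
    sample_rate_hertz=44100,
    enable_word_time_offsets=True,
    audio_channel_count=2,  # must match the WAV file; use 1 if it was downmixed with "-ac 1"
    language_code='en-US',
)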
sudo pip3 install --upgrade webvtt-py
import webvtt
for caption in webvtt.read('downloads/5BXtgq0Nhsc.en.vtt'):
    print(caption.text)
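Printing every caption can produce a lot of repeated text, because auto-generated captions often carry the previous line over into the next cue. Here is a rough sketch of how the repeats could be collapsed, assuming that is what your file looks like. The helper name and the sample cues are hypothetical, and it uses only the standard library so it can be tried without a .vtt file.

```python
def collapse_repeats(caption_texts):
    """Return caption lines in order, skipping a line that repeats the previous one."""
    lines = []
    for text in caption_texts:
        for line in text.splitlines():
            line = line.strip()
            if line and (not lines or lines[-1] != line):
                lines.append(line)
    return lines


# Hypothetical cue texts imitating rolling auto-captions.
cues = [
    "welcome to the show",
    "welcome to the show\ntoday my guest is",
    "today my guest is\nMalcolm Gladwell",
]
print(" ".join(collapse_repeats(cues)))
```

With webvtt-py, the same helper could be fed the caption.text of each caption returned by webvtt.read().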
by Tim Langeman
Although it looks as though the text etched into Panel Three of the Jefferson Memorial is a single quotation from Jefferson, it actually is a compilation of 6 separate quotations. Can you see what was lost when the context of each excerpt was stripped away?
Nothing is more certainly written in the book of fate than that these people are to be free.
Source: Wikimedia Commons.
These individual quotes are broken out on the official Monticello website into 6 separate quotations, although quotation #4 about the slaves’ freedom stops short of mentioning their deportation:
Some people argue that historians should throw Jefferson out of the American pantheon for his views on race. Benjamin Schwarz argues in a March 1997 article in The Atlantic that this would be a mistake:
Jefferson was a complex man. What he takes away with one hand, he gives with another.
On the opposite wall from the Northeast Portico, the Southeast Portico provides another quote from Jefferson — an edited version of a letter to Samuel Kercheval (June 12, 1816) — that indicates that Jefferson was open to changing his mind and the law. Here is the original, without the editing:
I am certainly not an advocate for frequent and untried changes in laws and constitutions. I think moderate imperfections had better be borne with; because, when once known, we accommodate ourselves to them, and find practical means of correcting their ill effects. But I know also, that laws and institutions must go hand in hand with the progress of the human mind. As that becomes more developed, more enlightened, as new discoveries are made, new truths disclosed, and manners and opinions change with the change of circumstances, institutions must advance also, and keep pace with the times. We might as well require a man to wear still the coat which fitted him when a boy, as civilized society to remain ever under the regimen of their barbarous ancestors.
In the Jefferson Memorial, the southeast panel, which argues that laws and institutions must change with the “progress of the human mind,” appears opposite the northeast panel with the cherry-picked quote about deportation. View the full-sized panoramic photo of the Memorial to inspect the inscriptions and see both panels side by side.
Jefferson Memorial, Washington, DC. Source: Wikiquote
Background Reading: Seeing White Podcast Series: Episode 4
Notes, ed. Peden, 163. Manuscript available at Massachusetts Historical Society.↩
by Tim Langeman
I released an updated version of the WordPress plugin tonight. This version handles the filename hashing a little bit differently: it escapes a list of characters, such as apostrophes and quotation marks, that can be problematic.
The basic functionality is now working pretty well, but performance needs to improve for pages that include multiple citations. The web service also has a memory leak that needs to be patched before it is really ready for prime time.
by Tim Langeman
Some Americans have become so disillusioned with the mainstream press that they refuse to trust basic reporting, assuming that even direct quotations may have been taken out of context.
Cherry-picked quotations are an old problem, but one which digital tools are uniquely suited to combat. In this article I’m going to demonstrate how a new set of authoring tools I’ve developed can allow authors to establish greater credibility by providing readers the additional context they need to quickly evaluate a quotation.
Take, for example, the following cherry-picked quotation about slavery from Thomas Jefferson’s autobiography:
Nothing is more certainly written in the book of fate than that these people are to be free.
In this selective quotation, Jefferson appears to be a forward-looking man, anticipating the emancipation of African-Americans. But if you click on the above down arrow, you can see that the text expands to reveal subsequent sentences which describe Jefferson’s desire to deport all the freed slaves, a part of history which is often omitted from our founding myths.1
Quotations that appear in print are frequently devoid of context, requiring the reader to trust the author or follow a footnote. I call these “severed quotations” because the context necessary for readers to evaluate them has been chopped off. I use the term “severed” as a way to suggest that we need to establish a new norm for quotations, one which the digital medium has only now made possible, where the text surrounding a quotation can be expanded to provide readers with a fuller context. This new norm can help readers distinguish between those authors who want to hold themselves to a higher standard of accountability and those who play fast and loose with the record.
Take, for example, the Saturday Night Live sketch that parodied Sarah Palin:
I can see Russia from my house!
This line was actually an exaggerated paraphrase of Sarah Palin that makes more sense if you read the full transcript of the original ABC News interview.
The interview was conducted in Alaska, and early on ABC anchor Charles Gibson raised the topic of Russia’s proximity to Alaska:
GIBSON: Let’s start, because we are near Russia, let’s start with Russia and Georgia.
Later in the exchange, Alaska’s proximity comes up again, with Palin answering Gibson:
PALIN: They’re our next door neighbors and you can actually see Russia from land here in Alaska, from an island in Alaska.
I was not a Palin supporter, but I am sympathetic to conservatives who say that, whatever faults Palin may have had, ABC News and Saturday Night Live treated her unfairly by taking an otherwise innocuous statement out of context. It is an accumulation of many of these sorts of incidents that degrades conservatives’ trust in mainstream media sources like ABC News.2
If media outlets want to distinguish themselves from the competition, one visible way of doing so would be to adopt the sort of expanding quotations I’ve demonstrated and couple this with a campaign to regain public trust.
I created the open-source Neotext project as a way to demonstrate the technical feasibility of this form of expanding quotations, and as a project to develop working implementations on many platforms.
You can experience what it is like to create your own Neotext quotations on my demo site by clicking on the “Test Drive” button below.
Test Drive: WordPress Plugin (alpha)
The goal of the Neotext project is to provide authors with tools that allow their readers to make an informed evaluation of sources, and to inspire greater confidence in the works of authors who are willing to hold themselves to a higher standard of accountability.
Medium could be the first platform to adopt this digital writing feature, if it:
If you don’t work at Medium.com and want to help on the WordPress plugin, I’m looking for:
If you’re interested in staying up to date with the latest news, join my email newsletter or subscribe to the RSS feed.
The plugin I created takes advantage of the standard HTML blockquote element, where the source URL is specified with the “cite” attribute. Platforms like Medium would need to modify their editors to allow the author to specify a URL with their blockquotes:
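To make the convention concrete, here is a minimal sketch of how a tool could find the blockquotes on a page that carry a “cite” attribute and pair each quote with its source URL. It uses Python's standard html.parser module. The class name, the sample markup, and the example.com URL are all placeholders for illustration; this is not the plugin's actual code.

```python
from html.parser import HTMLParser


class BlockquoteCollector(HTMLParser):
    """Collect (cite_url, quote_text) pairs from <blockquote cite="..."> elements."""

    def __init__(self):
        super().__init__()
        self.quotes = []   # finished (cite, text) pairs
        self._cite = None  # cite URL of the blockquote being read
        self._parts = []   # text fragments of the current blockquote
        self._depth = 0    # nesting depth, so nested blockquotes close correctly

    def handle_starttag(self, tag, attrs):
        if tag != "blockquote":
            return
        if self._depth == 0:
            self._cite = dict(attrs).get("cite")
            self._parts = []
        self._depth += 1

    def handle_endtag(self, tag):
        if tag != "blockquote" or self._depth == 0:
            return
        self._depth -= 1
        if self._depth == 0 and self._cite:
            text = " ".join("".join(self._parts).split())
            self.quotes.append((self._cite, text))

    def handle_data(self, data):
        if self._depth > 0:
            self._parts.append(data)


# Hypothetical markup; the URL is a placeholder, not a real source.
html = '''
<p>Intro paragraph.</p>
<blockquote cite="https://example.com/source">
  Nothing is more certainly written in the book of fate
  than that these people are to be free.
</blockquote>
<blockquote>A quote without a cite attribute is skipped.</blockquote>
'''

collector = BlockquoteCollector()
collector.feed(html)
for cite, text in collector.quotes:
    print(cite, "->", text)
```

Blockquotes without a “cite” attribute are ignored, so a page with no cited quotes produces an empty list and nothing further needs to run.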
Related: Download Sample HTML template
Contact me if you’re interested in helping with the programming or UI, or check out the project website or GitHub code.
This Jefferson quotation can be found in multiple sources, including the Jefferson Memorial, but if we draw in the full context of the quote from page 68 of Jefferson’s 1821 autobiography, we find that additional context reveals a more complicated history.↩
Distrust is also caused by conservative outlets’ continual promotions portraying the competition as unfair and biased.↩
by Tim Langeman
A new version of the WordPress Plugin is out which uses HTTPS to communicate with the Amazon Cloud.
This is an important upgrade because sites that run HTTPS require all JavaScript components to be delivered securely as well.
by Tim Langeman
Last month I posted an article I wrote about how Fake News, in the form of Out-of-Context Quotations, is not a new issue, but the internet poses both challenges and opportunities.
I described 2 examples of how media sources in 1999 and 2008 took quotations by Al Gore and Sarah Palin out of context and how this could be prevented if media outlets used Neotext.
I also expanded the concept of quote context to video and laid out 3 challenges to technologists.
by Tim Langeman
Here’s a progress report on where things stand.
I’ve been working on the webservice code to try to fix some problems with how the quote hash values are created. Right now the hash is computed in two places: by the client, using JavaScript, and by the Python web service.
Most of the time both sets of code produce the same result, but there are some cases when they do not, which prevents the client from finding the generated json file.
Before I go too much further, I’d also like to switch the hash algorithm from sha1 to something like sha256. This should be a fairly simple switch because I had planned for the hash algorithm to be swappable, without breaking backwards compatibility.
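To illustrate the idea, here is a rough sketch of a hash function with a swappable algorithm, built on hashlib.new() from the Python standard library. The normalization rules and the way the quote and the two URLs are joined into a key are assumptions made up for this example; they are not the scheme the web service actually uses. The point is that whatever rules are chosen have to be applied identically by the JavaScript client and the Python service, or the two hashes will disagree.

```python
import hashlib
import unicodedata


def normalize(text: str) -> str:
    """Reduce text to a canonical form so small formatting differences do not change the hash."""
    text = unicodedata.normalize("NFKC", text)
    # Hypothetical rule: fold curly quotes and apostrophes to plain ASCII.
    for fancy, plain in (("\u2018", "'"), ("\u2019", "'"), ("\u201c", '"'), ("\u201d", '"')):
        text = text.replace(fancy, plain)
    # Collapse every run of whitespace to a single space.
    return " ".join(text.split())


def quote_hash(quote: str, citing_url: str, cited_url: str, algorithm: str = "sha256") -> str:
    """Hash a normalized quote plus both URLs with a swappable algorithm."""
    key = "|".join((normalize(quote), citing_url.strip(), cited_url.strip()))
    digest = hashlib.new(algorithm)
    digest.update(key.encode("utf-8"))
    return digest.hexdigest()


# Two spellings of the same quote: curly vs. straight quotes, extra whitespace.
a = quote_hash("these  people are to be \u201cfree\u201d", "https://example.com/a", "https://example.com/b")
b = quote_hash('these people are to be "free"', "https://example.com/a", "https://example.com/b")
print(a == b)

# Older hashes can still be reproduced by naming the old algorithm.
print(quote_hash("x", "y", "z", algorithm="sha1"))
```

Because the algorithm is just a parameter, files generated under sha1 stay reachable while new ones move to sha256.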
by Tim Langeman
I got my first user feedback on the WordPress plugin from my friend Daniel Miller. In the interest of documenting it, here is Daniel’s feedback:
Feature: Add neotext to the Preview/Publish system so that the user doesn’t have to visit api.CiteIt.net and submit a new url. I think this is possible, I just have to research the WordPress API. One optimization I’d also like to do is detect whether a “cite” attribute has been used so the webservice doesn’t run unnecessarily.
Bug: Fix the extra non-HTML code that gets included in the text:
The expandable before-context of the quotation I pulled from neotext.net starts with “.. /analytics.js’,’ga’); ga(‘create’, ‘UA-65403609-1’, ‘auto’); ga(‘send’, ‘pageview’);
Feature: enable the editing (and possible formatting) of the quote
Question: Say I neotextify a page and then later I decide I don’t want the expanding quote contexts in that article. Is there an easy non-technical way I can turn off the neotext plugin for just one page in WordPress?
Answer: You can remove the neotext by removing the “cite” attribute from the HTML. Perhaps there could be a way to do this with the GUI.
Bug: The headings in the context for my quote are run together with no space between the surrounding text and the heading.
Possible solution: This appears to be the result of line breaks that have no space at the end of the line. Fixing it will require a different way of generating the text version of the HTML.
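One possible approach, sketched here with Python's standard html.parser module, is to insert a space at every block-level tag boundary and to skip the contents of script and style tags, which would also keep analytics code out of the context. The tag lists and names below are assumptions for illustration, not the web service's current code.

```python
from html.parser import HTMLParser

# Assumed set of tags that should act as word boundaries.
BLOCK_TAGS = {"p", "div", "br", "li", "h1", "h2", "h3", "h4", "h5", "h6", "blockquote", "tr"}
# Tags whose contents are code, not readable text.
SKIP_TAGS = {"script", "style"}


class TextExtractor(HTMLParser):
    """Build a plain-text version of a page with spaces between blocks."""

    def __init__(self):
        super().__init__()
        self._parts = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1
        elif tag in BLOCK_TAGS:
            self._parts.append(" ")

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag in BLOCK_TAGS:
            self._parts.append(" ")

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._parts.append(data)

    def text(self):
        # Splitting and rejoining turns newlines and repeated spaces into single spaces.
        return " ".join("".join(self._parts).split())


def html_to_text(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


sample = "<h2>Heading</h2><p>First line\nsecond line.</p><script>ga('send', 'pageview');</script><p>Last.</p>"
print(html_to_text(sample))
```

Headings no longer run into the surrounding paragraphs, and a line break inside a paragraph becomes a single space.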
Feature: Is there supposed to be a link to the original document in the top expandable section? Or maybe at the bottom of the bottom expandable section? Seems like that might be a nice feature, although maybe some people wouldn’t want that, so maybe simpler is better? (thinking out loud here)
Feature: Some of the JavaScript files included by the plugin are not minified. Would be nice if they were.
More Feedback:
If you have further suggestions, send them my way. My gmail account is timlangeman. If I like them, I’ll have you add them to the GitHub issues page.
by Tim Langeman
I set up a demo installation of WordPress that you can try:
The login is:
username: demo
password: demo
Read instructions