“There is no fear of mistake. If the one is true the other is true.”

Walt Whitman Photograph #106
When I first arrived at the Center for Digital Research in the Humanities, I wrote somewhat rapturously of the Center as a fecund site of interdisciplinarity, cradling librarians, computer experts, and humanists of all stripes. While the summer has sped past, my wonder at this fertility, at the consistency and ease of engagement between and among scholars, has not diminished. One of the real joys of working at the Center this summer has been the conversations with those in the university who are occupied with diverse problems and concerns and seem eager to reach beyond themselves, and beyond their disciplines or departments, to solve them. This sort of interdisciplinarity demands humility. It insists that we don’t ourselves know everything, that we cannot locate the solution to every challenge in our personal tool belts.

Escape from Positivism

"Willa Cather and Isabelle McClung Hambourg," 1923.

“…the sum of emphasized words to the whole number is 449:865, or not far from 1:2.  In the prose passage from Carlyle there is more than seventy per cent of emphasis, but the force-ratio of the present paragraph and the next is 25:45, or only fifty-five per cent.”
-L.A. Sherman, Analytics of Literature, 1893, p. 18. Quoted in Slote, 18.

Last week, Kay Walter, listening to my struggles with the complex mathematic of authorship attribution, suggested that I might want to sit down with Kari Ronning, who has worked on the Cather Journalism Project and has long been involved with preparing scholarly editions of Willa Cather’s texts, to discuss traditional methods of attribution.

How Complex the Mathematic!

"Jan Hambourg and Willa Cather swimming in the south of France," 1920.

Duquesne University mathematician and computer scientist Patrick Juola, who devised the JGAAP authorship attribution tool, is dedicated to making complex text analysis techniques accessible to scholars outside of statistics departments.  In a 2008 Digital Humanities conference abstract entitled “Authorship Attribution for the Rest of Us,” he conceded that “the statistics necessary for performing [attribution] can be onerous and mathematically formidable.  For example, a commonly used analysis method, Principle Component Analysis (PCA), requires the calculation of ‘the eigenvectors of the covariance matrix with the largest eigenvalues,’ a phrase not easily distinguishable from Star Trek technobabble” (250).  JGAAP seeks to “hide this complexity from the user” with a friendly framework and interface.

Distracting from Cather

"Group 6 Women, Willa Cather 2nd from right."

“The first factor to consider in an authorship question is the number of candidates involved.”
-Hugh Craig, “Stylistic Analysis and Authorship Studies,” in A Companion to Digital Humanities (2004).

The world was my oyster, the path forward clear and bright.  Armed with access to the JGAAP tool, “a modular framework for authorship attribution using the object-oriented capacities of the Java programming language” (Juola 2008), and text files of articles that appeared in Home Monthly circa 1896-1897, some of which were published under Willa Cather’s byline and others anonymously, I sat down to generate and triangulate statistical data that might indicate authorship of the anonymous articles.  I carefully uploaded my Known Author and Unknown Author texts and ran them through variations of Events (e.g., Character Bigram, Word Length) and Statistical Analyses (e.g., Levenshtein Distance, Histogram Analysis).

“Out of the dark confinement! out from behind the screen!”: Behind the Scenes at Whitman Camp

Thomas Eakins, "Old man, seven photographs," early- to mid-1880s.

The annual Whitman Camp, held last week in UNL’s Love Library, brought scholars from Nebraska, Iowa, Virginia, and Texas together to discuss their collective work on the Whitman Archive.  The professors and graduate students gathered in the first floor conference room reported on the progress they’d made over the past year and weighed in on the multitude of decisions that must be made as the Archive expands, shifts, and continually modernizes.

Expeditions in Text

"White Salmon Trout," March 16, 1806.

The Center was unusually quiet this past week, drained by the Digital Humanities conference in College Park and the International Cather Seminar in Chicago.  While Brian Pytlik Zillig, the Center’s Digital Initiatives Librarian and creator of TokenX, was talking to authorial attribution expert Patrick Juola in Maryland, I was continuing to read up on the promises and pitfalls of attribution techniques.

I also continued to learn about TokenX’s text comparison capabilities, under the tutelage of Government Documents Librarian Charlie Bernholz.  Charlie and I met to discuss the results generated by TokenX in assessing four American Indian treaties with double entries in the Statutes at Large.  When we ran these twinned treaties through TokenX, we were rewarded with similarity data based on the Levenshtein distance between pairs of tokens (for identical tokens, the distance is zero; for different tokens, the distance is equivalent to the number of changes needed to transform one token into another) in the documents.  Charlie further calculated mean edit distance per error rate and the normalized value of the edit distance.
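
For readers who want to see the token-level measure at work, here is a minimal sketch in Python of the Levenshtein distance as described above. It is not TokenX's code, and the function names are my own. The normalization shown (dividing by the length of the longer token) is one common convention and only an assumption here; it may not be the calculation Charlie used.

```python
def levenshtein(a: str, b: str) -> int:
    """Count the single-character insertions, deletions, and
    substitutions needed to turn token a into token b."""
    if a == b:
        return 0  # identical tokens have a distance of zero
    # previous[j] holds the distance between the first i-1
    # characters of a and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep a match
            ))
        previous = current
    return previous[-1]


def normalized_distance(a: str, b: str) -> float:
    """Assumed normalization: distance divided by the longer
    token's length, giving a value between 0.0 and 1.0."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


if __name__ == "__main__":
    # Hypothetical token pairs, not drawn from the treaties.
    for left, right in [("treaty", "treaty"), ("treaty", "treaties")]:
        print(left, right, levenshtein(left, right),
              normalized_distance(left, right))
```

Scaling by token length lets a one-letter slip in a long word count for less than the same slip in a short one, which is helpful when comparing two printings of the same treaty.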

The Wild West of Text Analysis

"Willa Cather driving a handcar," c. 1898.

“Criticism drifts into the language of mathematics.”
-Steve Ramsay, Algorithmic Criticism, 2008.

My first full week at the Center for Digital Research in the Humanities was dedicated to understanding and installing TokenX, Brian Pytlik Zillig’s multifaceted text analysis tool, and brainstorming about possible projects that might benefit from, and serve to strengthen, the tool.  Brian, in the past month or so, has added capabilities to TokenX 2.0, equipping the tool to quantitatively compare XML-encoded digital texts.

CDRH: An Introduction

M.P. Rice, "Washington D.C. 1865—Walt Whitman & his rebel soldier friend Pete Doyle," 1869.

The Center for Digital Research in the Humanities, based at the University of Nebraska-Lincoln, is a sort of wondrous place, committed to advancing and cultivating interdisciplinary research through the use of technology.  Since my arrival on Wednesday, June 10th, I have been introduced to the scholars who have edited and guided the Center’s vivid online archives and exhibits, including the Willa Cather and Walt Whitman online archives, witnessed Digital Initiatives and Special Collections Chair Kay Walter’s stunning expertise in UNL’s unique objects and scholarly pursuits, and encountered a wide range of projects that draw on the knowledge and interest of researchers across the university.