
Brian Larson, Douglas Walls, William Hart-Davidson, Kenneth C. Walker, Ryan Omizo,

Thirteen scholars gathered as part of a hands-on workshop at RSA's 2015 summer
institute. Bill Hart-Davidson and Ryan Omizo led us in the workshop, titled Building
Software: Modeling Theoretical Approaches to Technical and Professional Writing
with Computational Methods. Bill and Ryan's vision for the session was that we
would rapidly build a prototype application that would use computational methods
to explore an aspect of rhetorical genre theory.

Pictured: Casey Boyle, Steffen Guenzel, Sally Henschel, David Kaufer, Suzanne Lane,
Timothy Laquintano, Christine Stephenson. Not pictured: Heather Alexander.

corpus. The SNAP corpus has ****_______ reviews of products harvested from
Amazon.com. See Leskovec, J., & Sosic, R. (2014). Snap.py: SNAP for Python, a
general purpose network analysis and graph mining tool in Python; McAuley, J., &
Leskovec, J. (2013). Hidden factors and hidden topics: Understanding rating
dimensions with review text. RecSys.
genre hybridity. See Selber, S. A. (2010). A rhetoric of electronic instruction sets.
Technical Communication Quarterly, 19(2), 95-117.
Previous research showed that among product reviews rated as most helpful by
users of the service, those containing experience information (accounts of
customers using the product) were rated higher than those that did not. Skalicky,
S. (2013). Was this analysis helpful? A genre analysis of the Amazon.com discourse
community and its most helpful product reviews. Discourse, Context & Media,
2(2), 84-93.
We wondered whether computational tools could be reliably used to identify Amazon
reviews (a genre) that included instructional components (evidence of another
genre hybridizing with the first).

We adapted/adopted the agile/scrum development methodology to do rapid
research and development work at the same time the UX team was deciding what it
was we should be doing.

UX group adopted a consumer user story: "As an online shopper, I want to hear
how others have experienced the product I have or am about to purchase in order
to understand 1) what tips or advice others may have for effective use, 2) what
alerts others may offer to unwanted outcomes."
Basically CLICK: Tips, hacks, and warnings.
Images by CC license: Beware the fairies 2011 Alan Cleaver, https://flic.kr/p/aNbbmk.
Summer Construction 001 (Detour) 2008 Penn State, https://flic.kr/p/8xaUuK.
Curves 2007 Mykl Roventine, https://flic.kr/p/8xaUuK.

Research team had to figure out whether humans could identify tips, hacks, and
warnings.
The SNAP corpus was already divided into sentences when we got hold of it.
We took a few thousand sentences and developed a coding guide through an
iterative process.


That didn't make for the greatest Saturday night in Madison.

We did not have time to do interrater reliability checks.
Image CC license: Bees 2014 Sy, https://flic.kr/p/pcAtEH


Explain this.

SVM (support vector machine) is a statistical machine-learning algorithm (MLA). It
uses known values to identify a dividing line (known as a hyperplane) that correctly
divides the training instances and is at maximum distance from them. Ask me during
Q&A if you want me to try to explain this. (I was not on the dev team, but I used an
SVM classifier in a research project of my own.)
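To make the "maximum distance" idea concrete, here is a minimal sketch in plain Python. It is not the dev team's classifier: the four 2-D points, their labels, and the two candidate lines are invented for illustration, and the helper name `margin` is hypothetical. It only compares two lines that both separate the toy data and shows which one leaves the wider gap, which is the property an SVM optimizes for.

```python
import math

def margin(w, b, points, labels):
    """Smallest signed distance from any training point to the line w.x + b = 0.

    A positive result means every point lies on its correct side of the line.
    """
    norm = math.hypot(*w)
    return min(
        y * (w[0] * x[0] + w[1] * x[1] + b) / norm
        for x, y in zip(points, labels)
    )

# Toy training data (invented): two points labeled +1, two labeled -1.
points = [(2.0, 2.0), (3.0, 3.0), (0.0, 0.0), (-1.0, 0.0)]
labels = [1, 1, -1, -1]

# Two candidate dividing lines, each given as (w, b) for w.x + b = 0.
candidates = {
    "vertical line x = 1.5": ((1.0, 0.0), -1.5),
    "diagonal line x + y = 2": ((1.0, 1.0), -2.0),
}

margins = {
    name: margin(w, b, points, labels) for name, (w, b) in candidates.items()
}
# Of the two candidates, keep the one whose nearest point is farthest away.
best = max(margins, key=margins.get)

for name, value in margins.items():
    print(f"{name}: margin {value:.3f}")
print("wider margin:", best)
```

Both lines classify all four points correctly, but the diagonal one keeps its nearest points farther away. A real SVM searches over all possible lines rather than two hand-picked ones, and in a text task like ours the "points" would be sentences represented as numeric feature vectors.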


Here is the prototype app.


Discuss this.


Ryan Omizo notes: The testing set was rather small, so the scores are probably a
little too good. Also, F1, precision, and accuracy are more generous than other
metrics such as Cohen's Kappa, so we are starting a bit ahead anyway. But the scores
are good for a 3-day prototype.
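As a rough illustration of why kappa is the stricter measure, the sketch below computes accuracy, precision, recall, F1, and Cohen's Kappa from one confusion matrix. The counts are invented for illustration and are not the prototype's results; the function name `scores` is hypothetical. Kappa discounts the agreement that would be expected by chance, which matters when one class (here, non-instructional sentences) dominates.

```python
def scores(tp, fp, fn, tn):
    """Return common classifier scores from the four confusion-matrix counts."""
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    # Agreement expected by chance, from how often each label is
    # predicted and how often it actually occurs.
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (accuracy - p_chance) / (1 - p_chance)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "kappa": kappa,
    }

# Invented counts: 100 test sentences, 10 of them truly instructional.
result = scores(tp=8, fp=2, fn=2, tn=88)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts, accuracy looks strongest because the many easy negatives inflate it, while kappa comes out lowest. Kappa can never exceed accuracy; its position relative to F1 depends on the data.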


We set out to learn if a phenomenon like genre hybridity could be found in product
reviews as hypothesized by Selber [1] and suggested, albeit faintly, by Skalicky [14].
Working together, we learned that it could be found and, based on the results of the
RT, that it could be reliably found by humans. We also learned from the DT that it
was at least plausible that the signals for instructional text are distinct enough from
those of the persuasive components of reviews that we could train a machine-learning
algorithm to identify these in unseen texts. Finally, we learned from the UXT that
finding the bits of instructional texts in product reviews and presenting them in a
distinct view could be a useful service for consumers in its own right.

From Bill H-D: 1. Tentatively, we would say that the repetitive signals of instructional
text were indeed strong enough in the corpus we worked with that we could train a
machine learning algorithm to reliably find and distinguish them from the persuasive
signals of a product review.
2. Would we find genres embedded in and intermixed with others? Our subtractive
synthesis shows that we do, indeed. How much, how often, and whether the more
"hybrid" reviews are more highly ranked and/or useful to readers remain questions
to be answered. But we have more reason to think we can answer them after our
initial work.


Kenny Walker notes: We had a few lingering considerations with the coding guide.
In particular, . . . the interesting variations we found on mediating the interaction of
the reader/user (i.e., does the charge to be patient mediate interaction?) . . . . [T]he
variations on what constitutes typified rhetorical action are key concerns for genre
hybridity.

