Browsing Departments by Author "Hardt, Daniel"
Now showing items 1-3 of 3
Hardt, Daniel; Elming, Jakob (Frederiksberg, 2010)
Abstract: A method is presented for incremental retraining of an SMT system, in which a local phrase table is created and incrementally updated as a file is translated and post-edited. It is shown that translation data from within the same file has higher value than other domain-specific data. In two technical domains, within-file data increases BLEU score by several full points. Furthermore, a strong recency effect is documented; nearby data within the file has greater value than more distant data. It is also shown that the value of translation data is strongly correlated with a metric defined over new occurrences of ngrams. Finally, it is argued that the incremental re-training prototype could serve as the basis for a practical system which could be interactively updated in real time in a post-editing setting. Based on the results here, such an interactive system has the potential to dramatically improve translation quality.
URI: http://hdl.handle.net/10398/8272
Files in this item: 1
Hardt_Elming.pdf (201.1Kb)
Mikkelsen, Line; Hardt, Daniel; Ørsnes, Bjarne (Frederiksberg, 2011)
Abstract: Overt VP anaphors like do so, do it and do the same can host a following PP (Culicover & Jackendoff (2005:285–6), Huddleston & Pullum (2002:1533), Miller (2011:5–6), Sobin (2008:150, 155–157)):
(1) The House is set to take up the final version of the funding bill tomorrow. The Senate will do the same on Thursday. [COCA]
(2) You have jilted two previous fiances and I expect you would do the same to me. [COCA]
Using (1) to fix terminology, the ANAPHOR is do the same, the ANTECEDENT is take up the final version of the funding bill, the ORPHAN is on Thursday, and the CORRELATE is tomorrow. Examples like (2) are of particular interest because the correlate (two previous fiances) is inside the antecedent and, consequently, the orphan and the antecedent must interact to produce the interpretation of the clause containing the anaphor. In order to arrive at the interpretation ‘you would jilt me’, the me of the orphan must take the place of two previous fiances inside the antecedent VP. A superficially similar situation arises with remnants of ellipsis, including pseudogapping (3), sluicing (4), and fragment answers (5). In each case, the interpretation of the ellipsis clause combines part of the antecedent with all or part of the remnant.
(3) I wouldn’t say that to my mother, but I would to you.
(4) I know he gave the dresser away, but I don’t know to who.
(5) Q: Who did he give the dresser to? A: To me.
URI: http://hdl.handle.net/10398/8469
Files in this item: 1
mikkelsen_hardt_oersnes_2011.pdf (136.1Kb)
An Investigation of the Expression and Rating of Sentiment
Hardt, Daniel; Wulff, Julie (Frederiksberg, 2012)
Abstract: Do user populations differ systematically in the way they express and rate sentiment? We use large collections of Danish and U.S. film reviews to investigate this question, and we find evidence of important systematic differences: first, positive ratings are far more common in the U.S. data than in the Danish data. Second, highly positive terms occur far more frequently in the U.S. data. Finally, Danish reviewers tend to under-rate their own positive reviews compared to U.S. reviewers. This has potentially far-reaching implications for the interpretation of user ratings, the use of which has exploded in recent years.
URI: http://hdl.handle.net/10398/8606
Files in this item: 1
hardt_wulff.pdf (533.8Kb)