{"id":4784,"date":"2019-11-13T17:39:06","date_gmt":"2019-11-13T17:39:06","guid":{"rendered":"https:\/\/transfer.writingcommons.org\/?post_type=article&#038;p=4784"},"modified":"2024-06-21T20:14:43","modified_gmt":"2024-06-21T19:14:43","slug":"a-birds-eye-view-of-writing-corpus-linguistic-analysis","status":"publish","type":"article","link":"https:\/\/writingcommons.org\/article\/a-birds-eye-view-of-writing-corpus-linguistic-analysis\/","title":{"rendered":"Corpus Linguistic Analysis &#8211; A Bird\u2019s Eye View of Writing"},"content":{"rendered":"<div class=\"tts_content_wrapper_2\" >\n<p>Related Concepts: <a href=\"https:\/\/writingcommons.org\/section\/research\/research-methodology\/\" title=\"Research Methodology\">Research Methodology<\/a>; <a href=\"https:\/\/writingcommons.org\/section\/research\/research-methods\/textual-methods\/rhetorical-analysis\/\" title=\"Rhetorical Analysis\">Rhetorical Analysis<\/a>; <a href=\"https:\/\/writingcommons.org\/section\/research\/research-methods\/textual-methods\/\" title=\"Textual Research Methods\">Textual Research Methods<\/a><\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"What_is_Corpus_Linguistics_Analysis\"><\/span>What is Corpus Linguistics Analysis?<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"How_we_usually_read_and_write\"><\/span><strong>How we (usually)\nread and write<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p>If you are like most people in the United States, you read and\nwrite one phrase, sentence, and paragraph at a time. Then, you consider all the words, sentences, and paragraphs of\na full individual text, and that tells you what that text is about. <\/p>\n\n\n\n<p>For example, when you read the news, you\nprobably read or skim each news article or post from the beginning onward, and\nthen you think about what each one is about. &nbsp;For a class or your own purposes, you might\nalso consider the audience of a particular article, such as whether it is international\nor domestic, or left-leaning or right-leaning. This kind of attention to the\nrhetoric and rhetorical situation of individual texts is something you have probably\npracticed a good deal. 
<\/p>\n\n\n\n<p>Reading one sentence and text at a time\nis what your teachers tend to do when they read papers, too: they read your\npaper from start to finish, and then they read your classmate\u2019s paper, and so\non.<\/p>\n\n\n\n<p>You and your instructors may also think\nabout some aspects of writing <em>across<\/em> individual texts, such as genre or\npurpose. Your teachers might look across a stack of papers, for instance,\nand consider how well a class of students has used primary evidence in a\nresearch paper. In another example, you might look over a Twitter feed to see how\noften people retweet posts in a particular thread. In such instances, you and your\nteachers are paying attention to aspects of the rhetorical situation across\nmultiple texts. <\/p>\n\n\n\n<p>By contrast, you probably spend little time\nthinking about how <em>language<\/em>, in words, phrases, and sentences, is used <em>across<\/em>\nthe texts you read and write. That kind of focus, on language across texts, is\ncommon in linguistic approaches to writing, which are more popular outside\nthe U.S. than inside it. Accordingly, if your writing teachers have been\ntrained in U.S. rhetoric and composition rather than linguistics, they know a\nlot about students\u2019 writing generally but may not know a lot about the specific\nlanguage that students use across their papers and across courses. <\/p>\n\n\n\n<p>What does all this mean? Most U.S. readers and writers, and most U.S.\nstudent writing research, tend to discuss written texts one text at a time. Understanding\n<em>across<\/em> texts tends to focus on <em>contextual <\/em>patterns, such as\naudience or genre. Most U.S. readers and writers know less about <em>textual<\/em>\npatterns, or patterns of language across texts and contexts. <\/p>\n\n\n\n<p>Of course, on some level, you do think about language patterns, maybe without even\nrealizing it. 
It\u2019s part of why you can recognize a newspaper article and why\nyou know how to write a text message: you have paid attention to how people use\nlanguage in patterned ways. But this kind of knowledge\u2014the kind we pick\nup through casual observation\u2014is often subconscious and is rarely systematic. For\nexample, you can probably write a text message that is appropriate for a given rhetorical\nsituation without thinking much about it, because you have picked up on what\nkind of language is appropriate for the genre (text message) and audience (your\nrecipient, such as a family member or friend). But what do you do when you need\nto write something unfamiliar to you? If you are writing your first college composition\nessay, or your first psychology case study, how do you know what language patterns\nare preferred?<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Corpus_Linguistic_Analysis\"><\/span><strong>Corpus\nLinguistic Analysis <\/strong><span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p>This brings us to analysis that uses computer-aided tools to offer\nus a view of language patterns across texts\u2014a bird\u2019s eye view of written language\npatterns. This kind of analysis is called corpus linguistic analysis: the term <strong>corpus<\/strong>\nrefers to a body of texts, and <strong>linguistic analysis<\/strong>, as you saw before,\nrefers to the examination of patterns of language use. As a complement to understanding\none text at a time, corpus linguistic analysis can help us systematically\nanalyze and understand written language in terms of patterns across many texts\nand across time. <\/p>\n\n\n\n<p>Reading so far, you may already be picking up on three premises,\nor assumptions, related to corpus linguistics: <\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Texts\nmake meaning in patterned ways across texts and contexts. 
<\/li>\n\n\n\n<li>It\ncan be hard to comprehend language patterns if we are trained to read and\nanalyze only one text at a time. <\/li>\n\n\n\n<li>Attention\nto language across texts and contexts can teach us additional information about\nwhat is expected in particular rhetorical situations. <\/li>\n<\/ul>\n\n\n\n<p>You are probably already picking up on a detailed definition of\ncorpus linguistic analysis, too. <strong>Corpus\nlinguistic analysis<\/strong> refers to <em>the examination of textual patterns in a selected body of naturally produced texts, usually via computer-aided tools that\nfacilitate searching, sorting, and calculating large-scale textual patterns. <\/em><\/p>\n\n\n\n<p>Notice two key terms inside this definition:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong><em>Textual\npatterns<\/em><\/strong><em>: <\/em>lexical or grammatical patterns that persist across texts in a\ncorpus, in contrast to more varied choices or to patterns in other corpora<\/li>\n\n\n\n<li><strong><em>Naturally\nproduced texts<\/em><\/strong><em>: <\/em>a given corpus consists only of language produced for authentic,\nreal-world purposes<\/li>\n<\/ul>\n\n\n\n<p>In sum, corpus linguistic analysis is about identifying choices\npeople make (and don\u2019t make) across texts, and we can use the results of such\nanalysis to enhance our understanding of how language and texts work. 
Corpus\nlinguistic analysis has been used a lot since the mid- to late-20<sup>th<\/sup>\ncentury, especially outside of the U.S., in places like England, Asia, and\nAustralia, to help teachers and students learn about expert and student writing\nchoices that come up again and again.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"The_Birds-Eye_View_of_Language_Why_Corpus_Linguistic_Analysis\"><\/span><strong>The Bird\u2019s-Eye View of Language: Why Corpus Linguistic Analysis?<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p>You may not be convinced yet. If we are\nmost used to reading and writing one text at a time, why introduce something\ndifferent? Why get a bird\u2019s eye view of language patterns across texts? <\/p>\n\n\n\n<p>Some good reasons include that we get to\nsee different details when we look across texts\u2014details we can miss or\nmisperceive when we read one text at a time. Here are two key reasons why\ncorpus linguistic analysis can be useful, followed by examples from corpus\nlinguistic analysis of academic writing.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Our perceptions of language use are\noften misleading<\/strong>. <\/li>\n<\/ul>\n\n\n\n<p>It\u2019s easy to come to inaccurate\nconclusions about language, because some things catch our attention more than\nothers. For instance, people tend to think that language is changing rapidly\nwhen they read slang words on the Internet. But actually, there are many more\nwords on the Internet that have been around a long time than there are new\nwords. Corpus linguistic analysis has shown that only around 3% of online\nlanguage use includes internet-specific slang such as abbreviations. It\u2019s just\nthat the newer words grab our attention more than the old ones. 
In this\nexample, corpus linguistic analysis helps us quantify what percentage of words\non the internet are actually new words, and what percentage are words we have\nbeen using for a while. Let\u2019s\nconsider one more example, this one from research on academic writing<strong>.<\/strong><\/p>\n\n\n\n<p><strong>Have you ever found it difficult\nto read college textbooks? <\/strong>Doug Biber and his research team used\ncorpus linguistic analysis to analyze different kinds of language use on\ncollege campuses, including research articles, textbooks, and office hours. One\nthing they wanted to investigate was how textbooks compared to these other\nkinds of language use, because instructors often think that textbooks provide\neasy-to-read narrative descriptions for students. <\/p>\n\n\n\n<p>Based on corpus linguistic\nanalysis of all of these kinds of language, Biber et al. found that textbooks\nare not characterized by narrative, accessible language like spoken\nconversation. Instead, they tend to include dense, present-tense discussions of\nimplications, making textbooks challenging to read for students. In some ways,\ntextbooks are just as difficult to parse as research articles. <\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Much of our knowledge about\nwritten language is tacit, or unconscious<\/strong> (Odell et al.).<\/li>\n<\/ul>\n\n\n\n<p>Once we have learned to write in a\nparticular way, it is easy to forget the conscious steps we had to learn to do\nit in the first place. That is why it can be hard for your teachers to realize\nwhat might be challenging about an academic writing task they assign, and why\nit might be hard for you to explain to a grandparent how to write a tweet or\nhow to use hashtags. 
Let\u2019s again turn to a more specific example from research\non academic writing.<\/p>\n\n\n\n<p><strong>Have you ever felt like you didn\u2019t know what a teacher wanted in your writing?<\/strong> What teachers want can be subtle, or even unstated. Brown and Aull did a corpus analysis of Advanced Placement English essays that showed two distinct patterns in successful and unsuccessful essays. The successful student writing included specific, detailed phrases, while unsuccessful student writing included generic, emphatic phrases. This means, for instance, that a successful student essay might include the following sentence:<\/p>\n\n\n\n<p><strong>A\ntwentieth-century understanding of grief<\/strong><strong>\nsuggests that it takes time<\/strong>. <\/p>\n\n\n\n<p>In this sentence, a detailed\nphrase (<em>A twentieth-century understanding of grief<\/em>) is the\nsubject of the sentence. <\/p>\n\n\n\n<p>By contrast, an unsuccessful\nstudent essay might instead say: <\/p>\n\n\n\n<p><strong>Grief<\/strong><strong>\nobviously takes time<\/strong>. <\/p>\n\n\n\n<p>This sentence includes a simple\nsubject (<em>grief<\/em>) as well as an emphatic word, <em>obviously<\/em>. To\nacademic readers, the second sentence can seem too general and too strong.<\/p>\n\n\n\n<p>The bottom line is that our\nperceptions of language use can miss important patterns, because we tend to\nread one word, sentence, and text at a time. Getting a bird\u2019s-eye view allows\nus to understand more about the kinds of choices people tend to make with\nlanguage, including successful and unsuccessful choices in academic writing. As\nwe learn about such patterns and practice looking for them, we can become more\nadept at recognizing what characterizes different kinds of written texts. <\/p>\n\n\n\n<p><strong>Example\nexercise: Words that hang out with one another <\/strong><\/p>\n\n\n\n<p>Let\u2019s get some practice thinking about language patterns. 
We\u2019ll do\nthis by considering <strong>collocations<\/strong>, or\nthe words that most often hang out with other words. (The technical,\nfancy-sounding definition of collocations is \u201cthe habitual juxtaposition of a\nparticular word with another word or words with a frequency greater than\nchance.\u201d)<\/p>\n\n\n\n<p>First, try to guess: What words collocate, or hang out, most often\nwith the word <em>idea <\/em>in U.S. English? <\/p>\n\n\n\n<p>Specifically, what words do you think come just before <em>idea<\/em>,\nin all sorts of U.S. English (spoken, fiction, academic, news, and magazine)? List\nyour top 5 guesses. <\/p>\n\n\n\n<p>________________ <em>idea<\/em><\/p>\n\n\n\n<p>________________ <em>idea<\/em><\/p>\n\n\n\n<p>________________ <em>idea<\/em><\/p>\n\n\n\n<p>________________ <em>idea<\/em><\/p>\n\n\n\n<p>________________ <em>idea<\/em><\/p>\n\n\n\n<p>To test your guesses, we can turn to corpus linguistic analysis,\nusing the Corpus of Contemporary American English (COCA). COCA is an online\ndatabase where you can search all kinds of patterns in American English, across\nspoken conversation, fiction, academic writing, news, and magazines. You\u2019ll see\nCOCA listed in the resources below with a URL so that you can check it out\nyourself.<\/p>\n\n\n\n<p>For this search, we\u2019ll look for all words immediately to the left of\nidea. 
These are called <em>1L<\/em> collocates, because they appear 1 space to the\nleft<em>.<\/em><\/p>\n\n\n\n<p>Use of the word <strong>IDEA<\/strong> in\nCOCA (all registers)<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table><tbody><tr><td>\n  <strong>Top 10\n  1L Collocates <\/strong>\n  <\/td><td>\n  <strong>&nbsp;<\/strong>\n  <\/td><\/tr><tr><td>\n  <strong>good<\/strong><strong><\/strong>\n  <\/td><td>\n  <strong>idea<\/strong>\n  <\/td><\/tr><tr><td>\n  <strong>bad<\/strong><strong><\/strong>\n  <\/td><td>\n  <strong>idea<\/strong>\n  <\/td><\/tr><tr><td>\n  <strong>whole<\/strong><strong><\/strong>\n  <\/td><td>\n  <strong>idea<\/strong>\n  <\/td><\/tr><tr><td>\n  <strong>great<\/strong><strong><\/strong>\n  <\/td><td>\n  <strong>idea<\/strong>\n  <\/td><\/tr><tr><td>\n  <strong>better<\/strong><strong><\/strong>\n  <\/td><td>\n  <strong>idea<\/strong>\n  <\/td><\/tr><tr><td>\n  <strong>new<\/strong><strong><\/strong>\n  <\/td><td>\n  <strong>idea<\/strong>\n  <\/td><\/tr><tr><td>\n  <strong>very <\/strong><strong><\/strong>\n  <\/td><td>\n  <strong>idea<\/strong>\n  <\/td><\/tr><tr><td>\n  <strong>basic<\/strong><strong><\/strong>\n  <\/td><td>\n  <strong>idea<\/strong>\n  <\/td><\/tr><tr><td>\n  <strong>clear<\/strong><strong><\/strong>\n  <\/td><td>\n  <strong>idea<\/strong>\n  <\/td><\/tr><tr><td>\n  <strong>general<\/strong><strong><\/strong>\n  <\/td><td>\n  <strong>idea<\/strong>\n  <\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>How many of your guesses were right? Did you guess that not only\nare <em>good idea <\/em>and <em>bad idea<\/em> popular, but so too are the\nexpressions <em>(the) very idea, basic idea, <\/em>and<em> general idea?<\/em> <\/p>\n\n\n\n<p>Let\u2019s think about these patterns. Several collocations show <em>evaluation\n<\/em>of an idea (<em>good idea, bad idea, great idea<\/em>), including some\ncomparison (<em>better idea, new idea<\/em>). Others show <em>emphasis<\/em> on an\nidea (<em>(the) very idea<\/em>). 
Finally, others convey a summary or gist of an\nidea (<em>whole idea, basic idea, general idea<\/em>). (<em>Clear idea <\/em>is used\nboth in evaluation and in summary statements.) <\/p>\n\n\n\n<p>Many people guess that people describe ideas as <em>good <\/em>and <em>bad<\/em>,\nbut they don\u2019t realize how often speakers and writers use <em>idea <\/em>to let\ntheir audience know that they are summarizing something. As you read before, this\nis the kind of thing that corpus linguistic analysis can uncover: common\npatterns of language use that we don\u2019t necessarily pay attention to but that\ncan tell us what matters to people in a given type of writing. Picking up on\nthese collocates might, for instance, help students begin to notice how often\npeople summarize, and when they tend to do so.<\/p>\n\n\n\n<p>If we use the above examples, for instance, you could consider the\nfollowing as you begin to read and write in a new course: How do writers\ndescribe ideas? Do they evaluate them (e.g., as <em>good, bad, <\/em>or <em>correct<\/em>)?\nDo they describe them (e.g., as <em>theoretical, abstract,<\/em> or <em>practical<\/em>)?\nDo they summarize them (e.g., <em>general, overall<\/em>)?<\/p>\n\n\n\n<p>Let\u2019s explore one more example, this one concerning something many\nstudents wonder about: the first person in academic writing.<\/p>\n\n\n\n<p>Here\u2019s our question for this one: How do writers draw attention to\nthemselves as writers by using the first person <em>I<\/em> or <em>we<\/em>?<\/p>\n\n\n\n<p>Let\u2019s first make a guess about expert academic writing. In academic writing published in the U.S., what words do you think collocate, or hang out, with <em>I<\/em>? Specifically, what words do you think most often appear right after <em>I,<\/em> or immediately to the <strong>right<\/strong> of the word <em>I<\/em>, in academic writing? 
Again, note your top 5 guesses.<\/p>\n\n\n\n<p><em>I <\/em>________________\n<\/p>\n\n\n\n<p><em>I <\/em>________________\n<\/p>\n\n\n\n<p><em>I <\/em>________________\n<\/p>\n\n\n\n<p><em>I <\/em>________________\n<\/p>\n\n\n\n<p><em>I <\/em>________________\n<\/p>\n\n\n\n<p>We can again use corpus linguistic analysis to find out how\naccurate your guesses are. Specifically, we can use the Corpus of Contemporary\nAmerican English academic subcorpus (COCAA) and search for words &nbsp;1 space to the right, or 1R, of <em>I.<\/em><\/p>\n\n\n\n<p>Use of the word <em>I <\/em>in COCA, Academic writing<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table><tbody><tr><td>\n  <strong>&nbsp;<\/strong>\n  <\/td><td>\n  <strong>Top 10 1R Collocates<\/strong>\n  <\/td><\/tr><tr><td>\n  <strong>I<\/strong>\n  <\/td><td>\n  <strong>&nbsp;have&nbsp; <\/strong>\n  <\/td><\/tr><tr><td>\n  <strong>I<\/strong>\n  <\/td><td>\n  <strong>was<\/strong>\n  <\/td><\/tr><tr><td>\n  <strong>I<\/strong>\n  <\/td><td>\n  <strong>think<\/strong>\n  <\/td><\/tr><tr><td>\n  <strong>I<\/strong>\n  <\/td><td>\n  <strong>had<\/strong>\n  <\/td><\/tr><tr><td>\n  <strong>I<\/strong>\n  <\/td><td>\n  <strong>would<\/strong>\n  <\/td><\/tr><tr><td>\n  <strong>I<\/strong>\n  <\/td><td>\n  <strong>will<\/strong>\n  <\/td><\/tr><tr><td>\n  <strong>I<\/strong>\n  <\/td><td>\n  <strong>can<\/strong>\n  <\/td><\/tr><tr><td>\n  <strong>I<\/strong>\n  <\/td><td>\n  <strong>could<\/strong>\n  <\/td><\/tr><tr><td>\n  <strong>I<\/strong>\n  <\/td><td>\n  <strong>did<\/strong>\n  <\/td><\/tr><tr><td>\n  <strong>I<\/strong>\n  <\/td><td>\n  <strong>believe<\/strong>\n  <\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>First of all, using COCAA, we can see that even though lots of\nstudents have heard that they shouldn\u2019t use <em>I <\/em>in academic writing, corpus\nlinguistic analysis shows us that many published academic writers use <em>I, <\/em>or\n<em>we<\/em>. <\/p>\n\n\n\n<p>How do they use it? 
In these collocates, we can see a clear and\nconsistent pattern: academic writers use <em>I<\/em> as the subject of verbs, and\nthese verbs tend to help writers describe their processes (consider, for\ninstance, examples like <strong><em>I have<\/em><\/strong><em> observed, <strong>I was<\/strong> able to,\n<strong>I had<\/strong> collected<\/em>). Academic writers also use <em>I<\/em> to describe their\nthinking (<em><strong>I think<\/strong> that, <strong>I would<\/strong> suggest<\/em>). They\nalso, though less often, use <em>I <\/em>to describe beliefs: <em>I believe <\/em>is\nthe last of the top ten. <\/p>\n\n\n\n<p>How did your guesses hold up? A lot of people guess <em>argue, <\/em>thinking\nthat academics write <em>I argue <\/em>a lot, but it is not in the top ten.\nConversely, few people guess <em>I have <\/em>or <em>I had. <\/em>In addition, many\nstudents are surprised to see that academic writers are often tentative rather than\nexplicit about their arguments: as you can see, academic writers use <em>I\nwould, I think, <\/em>and <em>I could <\/em>far more often than <em>I argue.<\/em><\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Summing_up\"><\/span><strong>Summing up<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p>As you can see, sometimes corpus linguistic analysis can surprise\nus. It shows us that textbooks can be hard to read, that student grades are\nbased in part on the subjects of their sentences, and that academic writers use\n<em>I<\/em> to describe steps in their thinking and processes. With more analysis,\nwe learn more. 
<\/p>\n\n\n\n<p>Try out the resources below, and see what patterns you find with a\nbird\u2019s-eye view across many texts.<\/p>\n\n\n\n<p><strong>More examples\nof corpus linguistics research<\/strong><\/p>\n\n\n\n<p><em>Written\nversus spoken English:<\/em><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Very\nformal, academic writing tends to contain lots of nouns and prepositions, while\nmore informal language, including spoken conversation, tends to contain more\npronouns and verbs (Biber; Biber and Gray). <\/li>\n<\/ul>\n\n\n\n<p><em>Student\nwriting:<\/em><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Successful\nwriting by late-undergraduate and early-graduate writers shows clear differences\ndepending on the discipline. For example, writing in Philosophy and Education is\nmore narrative and interpersonal than writing in Biology or Physics. Writing in\nPolitical Science and Linguistics falls in between (Hardy and R\u00f6mer). <\/li>\n\n\n\n<li>First-year\ncollege writers tend to boost, or intensify, their ideas with words such as <em>really, truly, <\/em>or <em>clearly<\/em> more than they hedge, or qualify, their ideas with words\nsuch as <em>perhaps, might, <\/em>or <em>possibly<\/em>. 
This can make first-year\nwriting seem overstated to many academic readers, who tend to appreciate some\nspace for doubt and exception (Aull <em>First-Year<\/em>; Aull et al.; Aull and Lancaster; Hyland &#8220;Undergraduate\nUnderstandings&#8221;).\n<\/li>\n<\/ul>\n\n\n\n<p><em>Published academic\nwriting across disciplines:<\/em><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Writers\nin the social and natural sciences tend to use more first-person pronouns (<em>I, we<\/em>) to describe experimental\nprocesses, while writers in the humanities tend to use first-person pronouns to\nshowcase interpretive reasoning (Hyland &#8220;Stance&#8221;).<\/li>\n\n\n\n<li>Academic\nwriters across all disciplines still tend to hedge, or qualify, more than they\nboost, or intensify (Hyland <em>Disciplinary\nDiscourses<\/em>).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Corpus_Resources\"><\/span><strong>Corpus Resources<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p><em>Corpus of Contemporary American English\n(COCA):<\/em> <a href=\"https:\/\/www.english-corpora.org\/coca\/\"><strong>https:\/\/www.english-corpora.org\/coca\/<\/strong><\/a><\/p>\n\n\n\n<p>Details about COCA: Davies,\nM. (2011). Word frequency data from the Corpus of Contemporary American English\n(COCA).<\/p>\n\n\n\n<p><em>Michigan Corpus of Upper-Level Student Papers (MICUSP)<\/em><em>:<\/em>  <\/p>\n\n\n\n<p>Details about MICUSP: R\u00f6mer, Ute\nand O&#8217;Donnell, Matthew. From student hard drive to web corpus (part 1): the\ndesign, compilation and genre classification of the Michigan Corpus of\nUpper-level Student Papers (MICUSP). <em>Corpora<\/em>, vol. 6, no. 2, 2011:\n159-177.<\/p>\n\n\n\n<p><em>Collocation games<\/em>: see, e.g.,\nWu, Franken, and Witten. Collocation games from a language corpus. In <em>Digital\nGames in Language Learning and Teaching<\/em>. 
Palgrave Macmillan, London, 2012:\n209-229.<\/p>\n\n\n\n<p><em>The Grammar Lab<\/em>: David\nWest Brown\u2019s <a href=\"http:\/\/www.thegrammarlab.com\/\">www.thegrammarlab.com\/<\/a><\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Further_reading\"><\/span><strong>Further reading<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h3>\n\n\n\n<p>Corpus linguistic analysis can be particularly valuable for\nidentifying student-specific discourse (R\u00f6mer and Wulff).<\/p>\n\n\n\n<p>Textual patterns with attention to discipline, genre,\nassignment, level, and course<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Multiple features combine to\ncreate coherent styles, such as a more persuasive or more formal style, that\nare equally successful even for the same task (Crossley et al.)<\/li>\n\n\n\n<li>Some academic genres and fields (e.g., argumentative essays;\nhumanities) tend to include more features characteristic of informational writing (e.g., nouns and\nprepositions) <\/li>\n\n\n\n<li>Others (e.g., reports and natural sciences) include features more\ncharacteristic of interpersonal writing\n(e.g., adverbs and pronouns) <\/li>\n\n\n\n<li>(Aull &#8220;Argumentative Versus Explanatory\nDiscourse&#8221;; Hardy and R\u00f6mer; Nesi and Gardner)<\/li>\n<\/ul>\n\n\n\n<p>Textual patterns with attention to genre, assignment, level, and course<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Students\nmay develop in how they cite, engage with, and project others\u2019 views (\u00c4del and Garretson;\nCoffin; Coffin and Hewings)<\/li>\n\n\n\n<li>As\nundergraduate students develop, they hedge\nmore and boost less, and they begin to use certain cohesive strategies more (Aull and Lancaster)<\/li>\n\n\n\n<li>Successful\nadvanced student writing includes nouns that are metadiscoursal and methodology-related (Hardy and R\u00f6mer),\nversus more generic nouns, such as <em>people <\/em>or <em>society<\/em>, that are\nkey in 
first-year writing (Aull et al.)<\/li>\n<\/ul>\n\n\n\n<p><\/p>\n<\/div>","protected":false},"author":2,"featured_media":60256,"parent":0,"menu_order":0,"template":"","tags":[],"chapters":[2268,1914],"content_type":[],"class_list":["post-4784","article","type-article","status-publish","has-post-thumbnail","hentry","chapters-linguistic-analysis","chapters-textual-methods"],"acf":[],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/writingcommons.org\/wp-json\/wp\/v2\/article\/4784","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/writingcommons.org\/wp-json\/wp\/v2\/article"}],"about":[{"href":"https:\/\/writingcommons.org\/wp-json\/wp\/v2\/types\/article"}],"author":[{"embeddable":true,"href":"https:\/\/writingcommons.org\/wp-json\/wp\/v2\/users\/2"}],"version-history":[{"count":21,"href":"https:\/\/writingcommons.org\/wp-json\/wp\/v2\/article\/4784\/revisions"}],"predecessor-version":[{"id":65147,"href":"https:\/\/writingcommons.org\/wp-json\/wp\/v2\/article\/4784\/revisions\/65147"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/writingcommons.org\/wp-json\/wp\/v2\/media\/60256"}],"wp:attachment":[{"href":"https:\/\/writingcommons.org\/wp-json\/wp\/v2\/media?parent=4784"}],"wp:term":[{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/writingcommons.org\/wp-json\/wp\/v2\/tags?post=4784"},{"taxonomy":"chapters","embeddable":true,"href":"https:\/\/writingcommons.org\/wp-json\/wp\/v2\/chapters?post=4784"},{"taxonomy":"content_type","embeddable":true,"href":"https:\/\/writingcommons.org\/wp-json\/wp\/v2\/content_type?post=4784"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}