Dinosaur Comics! ([syndicated profile] dinosaur_comics_feed) wrote2025-06-18 12:00 am

anyway the case involved a man in a locked room who had been shot, but it turned out the killer inve


June 18th, 2025: This weekend I'll be in Utrecht for Heroes Dutch Comic Con - the biggest con in the Netherlands! I have never been to the Netherlands so please do send me all your SECRET NETHERLANDS RECOMMENDATIONS, and I hope to see you there!

– Ryan

Language Log ([syndicated profile] languagelog_feed) wrote2025-06-18 01:14 pm

Eggcorn of the month

Posted by Mark Liberman

YouTube's speech-to-text system is way behind the state of the art, or maybe has a good sense of humor. From its transcription of Donald Trump's 5/15/2025 speech in Qatar (the whitehouse.gov version):

A few other (meta-usage) examples of "Pulit suprise" are Out There, but even an old-fashioned bigram language model would know that the right answer is "Pulitzer Prize" — so it's a puzzle why Google's (presumably) LLM-based model screws this up so badly.
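To make the bigram point concrete, here is a minimal sketch of how such a model would rank the two candidates. The tiny corpus is invented purely for illustration, and the add-one smoothing is an assumption; nothing here reflects how YouTube's system actually works.

```python
from collections import Counter

# Toy corpus, invented for illustration only (not real transcript data).
corpus = (
    "she won the pulitzer prize . he won the pulitzer prize for fiction . "
    "the party was a surprise . what a nice surprise ."
).split()

unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))
vocab_size = len(unigrams)


def bigram_prob(prev: str, word: str) -> float:
    """Add-one smoothed estimate of P(word | prev)."""
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)


p_prize = bigram_prob("pulitzer", "prize")
p_surprise = bigram_prob("pulitzer", "surprise")
print(f"P(prize | pulitzer)    = {p_prize:.3f}")
print(f"P(surprise | pulitzer) = {p_surprise:.3f}")
```

Even with this handful of sentences, the continuation "prize" scores higher after "pulitzer" than "surprise" does, which is the sense in which an old-fashioned bigram model would get it right.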

And it makes the same choice in other recordings of the same speech, for example this one from Bloomberg:

And that recording's transcript has the same word sequence, but divides it into lines differently, though still in a way that makes no sense, either in terms of the message content or in terms of its prosodic delivery. The large variation in line length rules out the theory that the goal is just a certain number of words or characters per line. So again, why this application of Google's language model is so (variably) crappy is a puzzle.

The word error rate is not especially large, but the system makes plenty of other weird choices as well. In its transcription of that particular speech, Trump refers (in a somewhat rambling way) to Sean Duffy, in his role as Secretary of Transportation and also as a former lumberjacking champion. The YouTube transcription of the whitehouse.gov version has his name spelled "Sean" six times and "Shawn" three times. The YouTube transcription of the Bloomberg version uses each spelling five times. (I'm not clear why the totals are different, and don't have time to look into it further; a reader may figure it out for us…)

And here the spelling choices are also slightly different:

Random trawling through YouTube transcripts, as I've done over the years, turns up lots of weird stuff. As one other example, both of the cited transcripts render references to C.C. Wei as "Mr. weey", with a lower-case initial letter as well as a weird spelling, even though the context should make it clear to any Artificial (un)Intelligence that Trump is talking about the head of TSMC.

Maybe somebody from Google can explain what's going on.

Language Log ([syndicated profile] languagelog_feed) wrote2025-06-18 12:10 pm

Zipf genius

Posted by Victor Mair

I have always been deeply intrigued by George Kingsley Zipf (1902-1950), but Mark's recent "Dynamic Philology" (5/24/25) rekindled my interest.

Put simply,

He is the eponym of Zipf's law, which states that while only a few words are used very often, many or most are used rarely,

$$P_n \sim 1/n^a$$

where $P_n$ is the frequency of a word ranked $n$th and the exponent $a$ is almost 1. This means that the second item occurs approximately 1/2 as often as the first, and the third item 1/3 as often as the first, and so on. Zipf's discovery of this law in 1935 was one of the first academic studies of word frequency.
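As a rough illustration of the rank-frequency relation just described (frequency of the nth-ranked word proportional to 1/n^a, with a close to 1), here is a minimal sketch. The function name and the choice of five ranks are illustrative assumptions, not anything from Zipf or the quoted source.

```python
def zipf_relative_frequencies(num_ranks: int, a: float = 1.0) -> list[float]:
    """Frequency of each rank relative to rank 1, assuming P_n is proportional to 1 / n**a."""
    return [1.0 / (n ** a) for n in range(1, num_ranks + 1)]


# With a = 1, rank 2 is half as frequent as rank 1, rank 3 a third, and so on.
rel = zipf_relative_frequencies(5)
for rank, value in enumerate(rel, start=1):
    print(f"rank {rank}: {value:.3f} of the top word's frequency")
```

To compare against real text, one could count word frequencies with `collections.Counter`, sort them in descending order, and divide each count by the largest one.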

Although he originally intended it as a model for linguistics, Zipf later generalized his law to other disciplines. In particular, he observed that the rank vs. frequency distribution of individual incomes in a unified nation approximates this law, and in his 1941 book, "National Unity and Disunity" he theorized that breaks in this "normal curve of income distribution" portend social pressure for change or revolution.

(Wiktionary)

Because of its applicability to other types of data than purely linguistic ones, I sometimes feel that Zipf unlocked a secret key to the universe, which is truly humbling.  What is even more astonishing is that Zipf did not like mathematics, whereas mathematics-physics is usually thought of as the ultimate approach to Unified Field Theory.  It would seem that Zipf discovered a strictly empirically based approach to cosmology.

BTW, I have habitually pronounced his striking surname as it is spelled, accounting for all four letters, but Wikipedia gives it as /ˈzɪf/ ZIFF; German pronunciation: [tsɪpf].  Zipf is "from late Middle High German zipf zipfel ‘point tip corner’ hence a topographic name for someone who occupied a narrow corner of land as for example between converging channels of a stream; or a nickname for someone who wore a pointed garment like a long hood."

Source: Dictionary of American Family Names 2nd edition, 2022, as cited here.

In Bavarian and Austrian German, Zipf m (strong, genitive Zipfes or Zipfs, plural Zipfe):  "tip, peak, corner".

Reminds me of the Cantonese geonym zeoi2 咀 ("spit [narrow neck of land projecting into a body of water]", etc.), to be distinguished from the homonym-homophone zeoi2 咀 ("chew, masticate").

 

Selected readings

George Kingsley Zipf seems to have been an incredibly brilliant person. In addition to being Chairman of the German Department at Harvard, he was University Lecturer, a rare honor which meant that he could teach any subject he wanted. He died on September 25, 1950 at the age of 48 after a three-month illness. Yet, within that short life, not only did he discover Zipf's law, which has such important implications for linguistics, he applied similar models to human behavior (the principle of least effort), frequency distribution of individual incomes and its implication for national unity and disunity, and other vital fields. It is said that his statistical insights can explain properties of the internet, even though he arrived at them before it existed.

I'm especially intrigued to learn that he worked with Chinese and wonder what he focused on in that regard.

All in all, a fascinating person. If there's not a biography of Zipf, he's ripe for one.

(Wikipedia)

Language Log ([syndicated profile] languagelog_feed) wrote2025-06-17 12:41 pm

Pinyin Reading Materials

Posted by Victor Mair

[This is a guest post by Mok Ling]

I happen to know a few students (of varying ages and learning experiences) who want to learn (or re-learn, for some of them) Mandarin the "right" way (that is, focusing on speaking and listening before reading and writing, unlike what is prescribed by most HSK courses). Right now, I've got them chewing on the revised Pinyin edition of Princeton's Chinese Primer (which is in pure Pinyin — not a single sinograph until halfway into the course), but they obviously need something outside of a textbook to read.

I'd planned on giving them a Pinyinized Kong Yiji as a "goal text" to read once they have a firm command of the spoken language, but thinking back this seems like a bad idea because of how flowery Lu Xun can get.

My question is, are there any books I can give these students that are:
1. In sayable Chinese or 白話, NOT in the regular style of written Chinese (半文半白);
2. Interesting and distinct enough in style from the Primer.

My mind immediately went to Chao's Readings in Sayable Chinese (中國話的讀物), but I haven't been able to find ANY electronic copy thereof, much less a Pinyin edition. I also thought about the pure-Pinyin books printed for the ZT experiment but could not find any of the original materials — are those little storybooks still accessible?

As for online materials, the Pinyin Lit site you set up for the Pinyin Literature Contest has been very helpful, but I need something with a little more depth and length that I can go through with these learners.

VHM note:

From the time I started going to China in the early 80s, I tried to convince Chinese scholars, educators, and publishers of the great value and compelling need for the publication of pinyin reading materials of all types and at all levels.  I published a journal of romanized Chinese called Xin Tang.  I held an international contest of writing in pinyin in memory of my wife, Chang Li-ching, who collaborated with me on many pinyin projects, and, with the visionary assistance of Mark Swofford, published her memoirs in pinyin, and so on.  I have faith that, in the not too distant future, increasing amounts and kinds of pinyin reading materials will become available for those who are interested in them.

 


Language Log ([syndicated profile] languagelog_feed) wrote2025-06-16 01:06 pm

Unicode CJK Unified Ideographs Extension J and the nature of the sinographic writing system

Posted by Victor Mair

Submitted by Charles Belov:

I've been browsing through the proposed Unicode 17 changes, currently undergoing a comment period, with interest. While I don't have the knowledge to intelligently comment on the proposals, it's good to see that they are actively improving language access.

I'm puzzled that some new characters have been added to the existing Unicode CJK Unified Ideographs Extension C (6 characters) and Unicode CJK Unified Ideographs Extension E (12 characters) rather than added to a new extension. But the most interesting is the apparently brand-new Unicode CJK Unified Ideographs Extension J, with over 4,000 added characters.

I found the following characters of special interest:

– 323B0 looks like the character 五 with the bottom stroke missing.
– 323B3 looks like an arrangement of three 三s – does it possibly mean the same as 九?
– 32501, while not up to the character for biang for complexity, is nevertheless quite a stroke pile: the 厂 radical enclosing a 3 by 3 array of the character 有
– 3261E is the character 乙 in a circle, which doesn't look quite right to me as a legit Chinese character
– 326FB seems sexist to me: three 男 over one 女
– 33143, similarly to 32501, has ⻌ enclosing a 3 by 3 array of the character 日
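For readers who want to poke at these code points themselves, here is a minimal sketch using Python's standard `unicodedata` module. The hexadecimal values are copied from the list above; whether a name or an assigned category comes back depends on the Unicode version bundled with the Python in use, so the fallback string is an assumption about what an older database will report.

```python
import unicodedata

# Hexadecimal code points copied from the list above.
CODE_POINTS = ["323B0", "323B3", "32501", "3261E", "326FB", "33143"]


def describe(hex_code: str) -> tuple[str, str, str]:
    """Return (character, general category, name) for a hex code point."""
    ch = chr(int(hex_code, 16))
    # Category "Cn" means the code point is unassigned in this Unicode database.
    category = unicodedata.category(ch)
    name = unicodedata.name(ch, "<no name in this Unicode database>")
    return ch, category, name


print("Unicode database version:", unicodedata.unidata_version)
for code in CODE_POINTS:
    ch, category, name = describe(code)
    print(f"U+{code}: {ch!r} category={category} name={name}")
```

On a Python build whose Unicode data predates Extension J, each entry should simply show up as unassigned, and a font that covers the new block is still needed to see the glyphs.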

Alas, macOS does not yet support the biang character, so I can't include it in this email. Hopefully someday.

Character additions

VHM:

Note that, as it has been since the beginning of Unicode, CJK gobbles up the vast majority of all code points (see Mair and Liu 1991).

What is this fact telling us about the Chinese writing system, particularly in comparison with other writing systems?  How does one account for this disparity?  What is the meaning of this gross disparity?

The average number of strokes in a Chinese character is roughly 12.

The average number of strokes in a letter of the English alphabet is 1.9.

The average number of syllables in an English word is 1.66 (and 5 letters).

The average number of syllables in a Chinese word is roughly 2 (and 24 strokes).

The average number of words in an English sentence is 15-20.

The average number of words in a Chinese sentence is 25 (ballpark figure; see here)

Chinese has more than 100,000 characters.

English has 26 letters.

Total number of English words: over 600,000 (Oxford English Dictionary)

Total number of Chinese words: a little over 370,000 (Hànyǔ dà cídiǎn 漢語大詞典 [Unabridged dictionary of Sinitic])

und so weiter

 


Dinosaur Comics! ([syndicated profile] dinosaur_comics_feed) wrote2025-06-16 12:00 am

it was a time when the eggs were large and the sucking of said eggs was both expected and experience


June 16th, 2025: This weekend I'll be in Utrecht for Heroes Dutch Comic Con - the biggest con in the Netherlands! I have never been to the Netherlands so please do send me all your SECRET NETHERLANDS RECOMMENDATIONS, and I hope to see you there!

– Ryan

Language Log ([syndicated profile] languagelog_feed) wrote2025-06-15 12:40 pm

Dungan radio broadcasts from 2018-2021

Posted by Victor Mair

We've talked about Dungan a lot on Language Log.  That's the northwest Sinitic topolect written in Cyrillic that has been transplanted to Central Asia.  See "Selected readings" below.

For those of you who are interested and would like to hear what it sounds like in real life — spoken and sung by male and female voices — we are fortunate to have a series of ten radio broadcast recordings (here).

Note the natural, easy, undistorted insertion of non-Sinitic borrowings, e.g., "Salam alaikum" (Arabic as-salāmu ʿalaykum السَّلَامُ عَلَيْكُمْ, "Peace be upon you").  That would not be possible in sinographic transcription of northwest Sinitic speech.  This and other aspects and implications of alphabetic Dungan have been extensively discussed on LL.

After I brought Dungan speakers to America and wrote about them in Sino-Platonic Papers (no. 18, May 1990) and elsewhere four decades ago, they caught the attention of Berkeley professor William S-Y. Wang, to the extent that he organized a research trip to Kazakhstan / Kyrgyzstan where the Dungans live.  He was hoping to have one of his graduate students write her Ph.D. dissertation on Dungan.  Unfortunately, he had to give up on that plan because he said that neither he nor his graduate student could understand Dungan speech.

 

Selected readings

[Thanks to IA]

Language Log ([syndicated profile] languagelog_feed) wrote2025-06-14 01:40 am

Conversation with a Chinese restaurateur in a west central Mississippi town

Posted by Victor Mair

Running down the road in Clarksdale, Mississippi, I screeched to a halt (felt like Road Runner) when I passed by a Chinese restaurant with the odd name Rice Bowl (in Chinese it was Fànwǎn lóu 饭碗楼, the only characters I saw on the premises).  It was a tiny, nondescript establishment, with six or so chairs against the walls where you sat while you waited for your order to be prepared.  Most people, however, stood in line or just came in to pick up what they had ordered over the phone.

The owner did a brisk business, but it was strictly take out.  There were about 8 spaces for cars to park outside, though they were constantly coming and going.

The clientele was 100% Black Americans.  About half of them ordered egg rolls ($1.75 each), a quarter fried rice, and the remainder a predictable mix of standard American Chinese dishes (e.g., General Tso's Chicken, Moo Goo Gai Pan, etc.).  I wasted not one second on further scrutinizing the menu as soon as I spotted the Egg Foo Young.  There were several reasons for my hasty choice.  First of all, I hadn't tasted it for a long, long time.  Secondly, Egg Foo Young was my first exposure to "serious" Chinese cuisine.  It wasn't La Choy and it wasn't Chun King, i.e., it didn't come out of a can:

The only exception was that once a year our Mom would alternate taking one of the seven siblings to the big city of Canton (population about eighty thousand) five miles to the west and would treat us to a Chinese restaurant meal.  I think the owners were the only Chinese in the city.  The two things that impressed me most were how dark and mysterious the room was in the unmarked, old house where the restaurant was located, and how the egg foo young (and I just loved the sound of that name!), which was so much better than the canned chicken chow mein we ate at home, was served to us on a fancy, footed platter with a silver cover.  It was always a very special moment when the waiter uncovered the egg foo young and I smelled its extraordinary aroma.

(source)

After about 10-15 minutes, the Rice Bowl owner called out, "Egg Foo Young".  I walked up to the counter and said a few words in Mandarin to the owner as I picked up my order.  She was amazed.  "You speak Chinese?", she asked in English.  "Yes," I replied. "Nǐ huì bù huì jiǎng pǔtōnghuà? 你会不会讲普通话?"  "Not really," she answered in English.  "I speak Cantonese."  So I said a few words to her in Cantonese.  She was stunned, but after she had collected her senses, she asked, "Have you been to China?"  "Yes, a hundred times."  

That left the owner speechless.  So I repeated it in Mandarin and Cantonese.

Her eyeballs were glued to the back of their sockets and she seemed no longer able to breathe.

The owner had lots of other customers to take care of, so I thought it was time for me to leave.

"Zàijiàn / baai1baai3", I bid adieu.

 

P.S.:  The owner's actions were not unexpected.  In the many years she had been running that bustling, little take-out joint in Clarksdale, Mississippi, I doubt that she had ever seen a white man come in, certainly not one who spoke to her in Mandarin and Cantonese.

 

Selected readings

"General Tso's chikin" (6/11/13)

"General Chicken" (8/8/15)

"Chinese Philadelphia Food" (5/6/04)

"Chow mein from a can ≠ chǎomiàn / caau2min6 from a wok" (8/21/17)

Language Log ([syndicated profile] languagelog_feed) wrote2025-06-13 03:53 pm

Persian language in the Indian subcontinent

Posted by Victor Mair

That's the title of a valuable Wikipedia article.  I have no idea who wrote it, but I'm very glad to have access to this comprehensive article, since it touches on so many topics that concern my ongoing research.

Here are some highlights:

Before British colonisation, the Persian language was the lingua franca of the Indian subcontinent and a widely used official language in northern India. The language was brought into South Asia by various Turkics and Afghans and was preserved and patronized by local Indian dynasties from the 11th century, such as Ghaznavids, Sayyid dynasty, Tughlaq dynasty, Khilji dynasty, Mughal dynasty, Gujarat sultanate, and Bengal sultanate. Initially it was used by Muslim dynasties of India but later started being used by non-Muslim empires too. For example, in the Sikh Empire, Persian held official status in the court and the administration. It largely replaced Sanskrit as the language of politics, literature, education, and social status in the subcontinent.

The spread of Persian closely followed the political and religious growth of Islam in the Indian subcontinent. However Persian historically played the role of an overarching, often non-sectarian language connecting the diverse people of the region. It also helped construct a Persian identity, incorporating the Indian subcontinent into the transnational world of Greater Iran, or Ajam. Persian's historical role and functions in the subcontinent have caused the language to be compared to English in the modern-day region.

Persian began to decline with the gradual deterioration of the Mughal Empire. Urdu and English replaced Persian as British authority grew in the Indian subcontinent. Persian lost its official status in the East India Company in 1837, and fell out of currency in the subsequent British Raj.

Persian's linguistic legacy in the region is apparent through its impact on the Indo-Aryan languages. It played a formative role in the emergence of Hindustani, and had a relatively strong influence on Punjabi, Sindhi, Bengali, Gujarati, and Kashmiri. Other languages like Marathi, Rajasthani, and Odia also have a considerable amount of loan words from Persian.

Literature

A large corpus of Persian literature was produced by inhabitants of the Indian subcontinent. Prior to the 19th century, the region produced more Persian literature than Iran. This consisted of several types of works: poetry (such as rubaʿi, qasidah), panegyrics (often in praise of patron kings), epics, histories, biographies, and scientific treatises. These were written by members of all faiths, not just Muslims. Persian also was used for religious expression in the subcontinent, the most prominent example of which is Sufi literature.

This extended presence and interaction with native elements led to the Persian prose and poetry of the region developing a distinct, Indian touch, referred to as sabk-e-Hindi (Indian style) among other names. It was characterised by an ornate, flowery poetic style, and the presence of Indian vocabulary, phrases, and themes. For example, the monsoon season was romanticised in Indo-Persian poetry, something that had no parallel in the native Irani style. Due to these differences, Iranian poets considered the style "alien" and often expressed a derisive attitude towards sabk-e-Hindi. Notable practitioners of sabk-e-Hindi were Urfi Shirazi, Faizi, Sa'ib, and Bedil.

Translations from other literary languages greatly contributed to the Indo-Persian literary corpus. Arabic works made their way into Persian (e.g. Chach Nama). Turkic, the older language of Islamic nobility, also saw translations (such as that of Chagatai Turkic "Baburnama" into Persian). A vast number of Sanskrit works were rendered into Persian, especially under Akbar, in order to transfer indigenous knowledge; these included religious texts such as the Mahabharata (Razmnama), Ramayana and the four Vedas, but also more technical works on topics like medicine and astronomy, such as Zij-e-Mohammed-Shahi. This provided Hindus access to ancient texts that previously only Sanskritised, higher castes could read.

Influence on subcontinental languages

As a prestige language and lingua franca over a period of 800 years in the Indian subcontinent, Classical Persian exerted a vast influence over numerous Indic languages, which includes non-Indo-Aryan languages. Generally speaking, the degree of impact is seen to increase the more one moves towards the north-west of the subcontinent, i.e. the Indo-Iranian frontier. For example, the Indo-Aryan languages have the most impact from Persian; this ranges from a high appearance in Punjabi, Sindhi, Kashmiri, and Gujarati, to more moderate representation in Bengali and Marathi. The largest foreign element in the Indo-Aryan languages is Persian. Conversely, the Dravidian languages have seen a low level of influence from Persian. They still feature loans from the language, some of which are direct, and some through Deccani (the southern variety of Hindustani), due to the Islamic rulers of the Deccan.

Hindustani is a notable exception to this geographic trend. It is an Indo-Aryan lingua franca spoken widely across the Hindi Belt and Pakistan, best described as an amalgamation of a Khariboli linguistic base with Persian elements. It has two formal registers, the Persianised Urdu (which uses the Perso-Arabic alphabet) and the de-Persianised, Sanskritised Hindi (which uses Devanagari). Even in its vernacular form, Hindustani contains the most Persian influence of all the Indo-Aryan languages, and many Persian words are used commonly in speech by those identifying as "Hindi" and "Urdu" speakers alike. These words have been assimilated into the language to the extent they are not recognised as "foreign" influences. This is due to the fact that Hindustani's emergence was characterised by a Persianisation process, through patronage at Islamic courts over the centuries. Hindustani's Persian register Urdu in particular has an even greater degree of influence, going as far as to admit fully Persian phrases such as "makānāt barā-ē farōḵht" (houses for sale). It freely uses its historical Persian elements, and looks towards the language for neologisms. This is especially true in Pakistan.

The following Persian features are hence shared by many Indic languages but vary in the manner described above, with Hindustani and particularly its register Urdu bearing Persian's mark the most. It is also worth noting that due to the politicisation of language in the subcontinent, Persian features make an even stronger appearance among the Muslim speakers of the above languages.

There are separate sections on vocabulary (loanwords [with a long list of examples], indirect loans, and compounds), phonology, grammar, and writing systems.

This is not to complain, but missing from the references is this important volume edited by two of my colleagues at Penn:

Brian Spooner and William L. Hanaway, ed., Literacy in the Persianate World: Writing and the Social Order (Philadelphia:  University of Pennsylvania Press, 2012).

Table of Contents

The last chapter in the book is “Persian Scribes (munshi) and Chinese Literati (ru): The Power and Prestige of Fine Writing (adab/wenzhang)”, by VHM; its final paragraph reads:

Persian as a lingua franca spread not only through much of the Islamic world, but even as far as China during the thirteenth century, when Iran was loosely incorporated into the Mongol Empire. David Morgan shows how Persian became for a time the most important foreign language in China, where it was used in commercial exchanges with Muslim merchants profiting from the Pax Mongolica. But it was the Muslim realms in India that most fully adopted the Persian language and culture. The high point was reached in the sixteenth and seventeenth centuries, when the generous patronage offered by the wealthy Indian courts, and especially the Mughal court, attracted many poets from Iran. Muhammad Aslam Syed traces the decline of Persian in Muslim India and the rise of Urdu, a related vernacular language, to the second half of the eighteenth century. He associates it with the “humiliating” sack of Delhi by the Iranian ruler, Nadir Shah, in 1739, and the rise of a “new nobility” of poets who were merchants and shopkeepers and were uncomfortable with Persian as the language of the “old nobility”. The final blow to the status of Persian in India came in 1835 when the East India Company replaced it with English as the official language and in 1837 with Urdu as the language of the law courts. But for many, the loss of Persian was a cause for lament. Syed quotes the Indian poet Ghalib (1797-1869), who is regarded as the greatest Urdu poet, but who also composed poems in Persian: “If you want to see all the colours of life, read my Persian poetry, my Urdu diwan does not have all those colours. Persian is the mirror (of life) and Urdu is just like rust on that mirror (with which you start but when it is clean, it is Persian)”.

Spooner and Hanaway spent a couple of decades doing the research that resulted in this significant volume.  Their contribution is both lasting and substantial.

 


[Thanks to Sunny Jhutti]

Language Log ([syndicated profile] languagelog_feed) wrote2025-06-13 12:55 pm

Plato's cave

Posted by Mark Liberman

The first two panels from SMBC a few days ago:

The rest of the strip:

The aftercomic:

The mouseover title: "Would you rather sit with friends watching shadows on the bigscreen or spending your time arguing with Plato about whether poetry should be legal?"

This expands on the 9/9/2015 SMBC:

Wikipedia explains the Allegory of the Cave, or you can read the original here.

Wikipedia's explanation of platonic reincarnation is here — and for its linguistic relevance, see here.

Language Log ([syndicated profile] languagelog_feed) wrote2025-06-13 10:11 am

The linguistic pragmatics of LLMs

Posted by Victor Mair

Ljubiša Bojić, Predrag Kovačević, and Milan Čabarkapa, "Does GPT-4 Surpass Human Performance in Linguistic Pragmatics?", Humanities and Social Sciences Communications 12, no. 1, Article number 794 (June 10, 2025).

 

Language Log ([syndicated profile] languagelog_feed) wrote2025-06-12 11:44 pm

"More and more less confident"

Posted by Mark Liberman

From Adam Rasgon and Natan Odenheimer, "U.S. Embassy in Jerusalem Braces for Possible Israeli Strike on Iran" NYT 6/12/2025:

More recently, however, Mr. Trump has said he was less convinced that talks with Iran would yield a new nuclear deal.

“I’m getting more and more less confident about it,” he told The New York Post in a podcast broadcast on Wednesday.

Here's the podcast on YouTube. The quoted phrase is from around 35:08:

uh I'm f- I'm getting more and more less confident about it

"More and more less ADJ" is Out There — e.g. COCA has

Obama should stick to reality, but that appears more and more less likely from him.
Kids are becoming more and more less active and using their imagination less and less.

I haven't found (or thought up) any examples of "less and less more ADJ". And I think there's a semantico-pragmatic reason for the fact that "more and more less ADJ" is more plausible than "less and less more ADJ".  But working out the formal logic is giving me a headache, so I'll leave it to the commenters.

 

 

Language Log ([syndicated profile] languagelog_feed) wrote2025-06-12 01:57 pm

Names as verbs

Posted by Mark Liberman

In a comment on yesterday's post "A 12th-century influencer", Laura Morland wrote:

Thanks for sharing "to abelard," the new verb of the month! Note to AP: the grammarians will insist that it be spelled with a lower-case "a". (Verbs are never capitalized, not even in German, I don't believe.)

This is one where The Errorist might have the upper hand.

The name most often verbed in English is probably MacGyver, and its verbal uses (almost?) always retain the capital letters. A few examples from the news:

[link] Don’t MacGyver a Solution to the R-454B Shortage
[link] This PopSocket Will Help You MacGyver Your Way Out of a Pickle
[link] 5 Badass Female TV Characters In STEM (And An Instance They Have MacGyvered)
[link] Macgyvered Neck Brace Saves Rare Peruvian Grasshopper
[link] The Pinkbike Podcast: Fox's Gearbox, 'MacGyvering' Ultra Premium Bikes & Counting Chains
[link] ‘MacGyvering’ Inventorship – It’s Much More than a TV Trope

Merriam-Webster agrees; so does Wiktionary, though they give a lower-case version as an "alternate spelling". The OED as well:

And the BBC even wrote about it: "How 'MacGyver' became a verb".

I didn't yet turn up a "grammarian" opinion on this, but I did find a scholarly paper on the history of the verbification process: Aurélie Héois, “When Proper Names Become Verbs: A Semantic Perspective”, Lexis 2020.


Language Log ([syndicated profile] languagelog_feed) wrote2025-06-12 10:46 am
Language Log ([syndicated profile] languagelog_feed) wrote2025-06-11 06:43 pm

A 12th-century influencer

Posted by Mark Liberman

From Ada Palmer, "Inventing the Renaissance: The Myth of a Golden Age":

The new scholastic method was so exciting! that when Peter Abelard got kicked out of his monastery (for proving its founding saint didn’t exist; that pissed off the abbot, who’d have guessed?) and went to live as a hermit in the wilderness of Champagne, 100,000 people flocked there to form a tent city and listen to him teach. Abelard’s crowd wasn’t bigger than Woodstock but it was twice the size of Paris at the time, ample to make France fear that crowd + superstar preacher => private army? Later, when Thomas Aquinas was up for sainthood, his advocates argued that every single chapter in his Summa Theologica should be considered an individual miracle, and the judges agreed. (It’s official folks, 3,000+ miracles in one compact paperback, only $12.99! Unless you want to buy it in the period, in which case it’s $650,000; you don’t get scholarship before the printing press unless wealthy elites believe it’s really, really worth the $$$!)

Of course that's just a crumb from the loaf of Abelard's story, but Palmer's focus in that passage is on scholasticism — her chapter 23 on that topic is here.

If you enjoyed that fragment, you should read the whole thing. There are 66 other chapters, including chapter 59, "We Can't Just Abelard Harder Anymore", which starts like this:

In 1123, Peter Abelard attracted Woodstock-sized crowds using The Philosopher (Aristotle) to make two seemingly contradictory authorities agree. In the 1260s The Theologian Thomas Aquinas demonstrated this wedding of authorities even more potently, and all supporting Christianity. In 1345, the circle of scholar-friends of whom the Black Death spared few but Petrarch read Cicero’s so-Christian-seeming moral works, and spread the dream of tracking down more missing scraps of Lady Philosophy’s torn gown. It would all fit together, they were sure, as one saw from how much Plato, Aristotle, and (pseudo-) Dionysius agreed with Augustine. Cue book-hunting Poggio, Filelfo, and Aurispa trekking out. Cue retranslation, as master philologists like Lorenzo Valla, Pomponio Leto, and Poliziano give us a larger and stranger classical canon (we were so wrong about what Aristotle said!). Cue our charismatic genius Pico Abelarding harder than any man dared Abelard before, merging Kabbalah, Zoroaster, and the Koran, held together with Platonic baling wire. They must agree, they’re wise, and Plato and Aquinas both say all wise people are trying to paint wordportraits of the same truth glimpsed from slightly different angles.

Ada Palmer, the bright-eyed bushy-tailed grad student, found my Renaissance buddies Abelarding hard like this as I trekked through the archives seeking what they said about Lucretius. Some claimed Lucretius doesn’t really deny the immortal soul, offering other interpretations of those lines, or saying he wrote those sections while ill and temporarily insane, or that a wholesome Roman like Lucretius (friends with Cicero!) never believed such things but was repeating things Epicurus had said, who was farther from Christianity so more confused. But Lucretius is very hard to Abelard, it gets awkward, and half the commentaries resort to saying he was intermittently insane, to explain the times he gazed away from the True Subject of the universal portrait and started doodling these wacky atoms. As trips to Greece and back brought home more books, Lucretius was far from the only nail that refused to be hammered down.

Language Log ([syndicated profile] languagelog_feed) wrote2025-06-11 10:36 am

Boop?

Posted by Mark Liberman

The latest xkcd:

Mouseover title: "With a good battery, the device can easily last for 5 or 10 years, although the walls probably won't."

The joke worked for me, although I was pretty sure that a (current) MacBook makes no sound when a usb device connects. I checked, and that's true.

A current Windows 11 laptop does make a usb-connect sound — but "BOOP!" is not really a good onomatopoeia for it:

There are two distinct notes, roughly a descending fifth (at approximately 587 Hz and 392 Hz), so a monosyllabic "BOOP!" doesn't really work. And the onset of the first note is not really very stop-like, much less evoking a [b]:

It seems to me more like [ˈa.u] (= "AA-oo"), though I admit that's not nearly as evocative as "BOOP!". Commenters may have better ideas.

The only "device connect sound" I could find on the internet was this one:

It's more complicated, and even less BOOP!-like. The video's caption is "Windows 10/8 Device Connect Sound", which may be why it's familiar, though I can't place it more exactly than "that's a familiar sound".

Maybe Randall has a laptop with a different (and more boop-like) usb-connecting sound? Or maybe he was subconsciously inspired by the new musical?

The explanation and discussion at explainxkcd are not helpful with respect to this aspect of the joke.

Update — Commenters YRG and AKMA note that MacBooks make a rather boop-like sound when connected to charging power — here's what it sounds like on my laptop:

There are two partials, one at 880 Hz and the other at (concert A) 440 Hz, with amplitude contours that you can see in the spectrogram.
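The interval sizes mentioned in this post can be checked with a bit of arithmetic. Here is a minimal sketch, assuming twelve-tone equal temperament (so that an interval in semitones is 12 times the base-2 logarithm of the frequency ratio); the function name `interval_in_semitones` is just an illustrative label, and the frequencies are the approximate values quoted above.

```python
import math


def interval_in_semitones(f_high_hz: float, f_low_hz: float) -> float:
    """Size of the interval between two frequencies, in equal-tempered semitones."""
    return 12 * math.log2(f_high_hz / f_low_hz)


# Windows 11 usb-connect sound: two notes at roughly 587 Hz and 392 Hz.
usb_connect = interval_in_semitones(587, 392)

# MacBook charging sound: two partials at 880 Hz and 440 Hz.
charger = interval_in_semitones(880, 440)

print(round(usb_connect, 2))  # close to 7 semitones, i.e. a perfect fifth
print(round(charger, 2))      # 12 semitones, i.e. an octave
```

On these numbers, the Windows sound is two notes about a fifth apart, while the two MacBook partials are exactly an octave apart.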

I wonder who invents such sounds, and who "owns" them (if anyone does)?

Language Log ([syndicated profile] languagelog_feed) wrote2025-06-11 12:47 am

The grammar and sense of a poetic line

Posted by Victor Mair

Randy Alexander is not a professional Sinologist, but when it comes to reading Chinese poetry, he's as serious as one can be.  The following poem is by Du Fu (712-770), said by some to be "China's greatest poet".  In the presentation below, I will first give the text with its transcription, and then Randy's translation.  After that we will delve deeply into the grammatical exegesis of one line of the poem, the last.  

Dù Fǔ "Zèng Wèi Bā chǔshì"

—–

Rénshēng bù xiāng jiàn, dòng rú cān yù shāng. 

Jīnxī fù hé xī, gòng cǐ dēngzhú guāng. 

Shàozhuàng néng jǐshí, bìnfà gè yǐ cāng. 

Fǎng jiù bàn wéi guǐ, jīng hū rè zhòng cháng. 

Yān zhī èrshí zài, zhòng shàng jūnzǐ táng. 

Xī bié jūn wèi hūn, érnǚ hū chéng háng.

Yírán jìng fùzhí, wèn wǒ lái héfāng. 

Wèndá wèi jí yǐ, érnǚ luó jiǔjiāng. 

Yè yǔ jiǎn chūn jiǔ, xīn chuī jiān huáng liáng. 

Zhǔ chēng huìmiàn nán, yī jǔ lèi shí shāng. 

Shí shāng yì bù zuì, gǎn zǐ gùyì zhǎng. 

Míngrì gé shānyuè, shìshì liǎng mángmáng.

杜甫《赠卫八处士》

—–

人生不相见,动如参与商。
今夕复何夕,共此灯烛光。
少壮能几时,鬓发各已苍。
访旧半为鬼,惊呼热中肠。
焉知二十载,重上君子堂。
昔别君未婚,儿女忽成行。
怡然敬父执,问我来何方。
问答未及已,儿女罗酒浆。
夜雨剪春韭,新炊间黄粱。
主称会面难,一举累十觞。
十觞亦不醉,感子故意长。
明日隔山岳,世事两茫茫。

Presented to Wei Ba, an Unofficed Scholar

——-

In life we don't meet each other.
We move like The Belt of Orion and Antares.

Tonight again, is what kind of night?
Together here the lights glow.

The young and strong are able — for how long?
Their sideburns each will also turn grey.

When visiting old friends who half became ghosts,
I cry out in pangs of emotion.

Who knew that in twenty years,
You would be taking up a post at a lord's manor?

Long ago when we parted, you weren't yet married.
Now your children suddenly line up in front of me.

Happily they salute their father's friend,
And ask me where I came from.

The questioning and answering hadn't had time to finish
when the children laid out the wine and juice.

In the night rain you cut the spring chives;
in the fresh-cooked rice there is millet.

Our host speaks of our meeting's difficulty;
With one motion he lifts ten cups.

Ten cups and you aren't even drunk;
I'm moved that your old friendship is growing.

As the bright sun separates the mountains,
the world and its affairs both are far away.

Here are Randy's principles for translating from Literary Sinitic to English:

My general rules for translation are: 

1) don't add anything  (except things like implicit pronouns) or take anything away, 
2) as much as possible stick to the original word order even if it enters into syntactic poetic license in the English (of course within what's syntactically allowable in English poetry).


I don't read anyone else's translation first, but after I translate I will check some on the web to see if there are any major discrepancies. Here, the last line seems to be traditionally translated as something like "Tomorrow we will be separated by mountains, the world's affairs are unclear", but I see some problems with this (I don't think it's impossible, but there are some problems).

明日隔山岳,世事两茫茫。

First is the inanimate agency/cause of the passive gé 隔. I can't find anything in Hànyǔ dà cídiǎn 汉语大词典 (Unabridged dictionary of Sinitic) that has a similar structure for 隔. In a big grammar I have, Gǔ Hànyǔ yǔfǎ jí qí fāzhǎn 古汉语语法及其发展 (Ancient Chinese Grammar and Its Development) by Yáng Bójùn 杨伯峻 and Hé Lèshì 何乐士, I found a very small mention (p. 693 I think):

2.1.3 “(受事主语)·动·工具宾语”
宾语表示动作行为的工具。如:
(1)不夭斤斧。(庄子·逍遥游)30
“不夭(于)斤斧”。意谓不被斧子之类的器物所夭折或夭伤。

OK, so I guess it's "legal" to have a usage interpretation like that in this poem, but given its apparent rarity in Classical Chinese (I'm judging by the fact that I have only been able to find this one mention of this kind of usage and that it's quite old) I think it would be stretching it to say the mountains are separating them. The examples in HDC all seem to be X隔 (X separates [us/them]) or 隔X (separates X).

Another problem is liǎng 两 ("two"). This is pretty clearly "two/both" which can only point to shì 世 ("world") and shì 事 ("affair[s]") as separate entities. If they are separate, then parallelism would strongly suggest (dictate?) that míng 明 and rì 日 ("day") are also separate. Despite the fact that 明日 ("bright day / sun") almost always means "tomorrow", if we can say míngyuè 明月 ("bright moon") then of course it's not ungrammatical at all to say 明日 ("bright sun / day"); perhaps Du Fu 杜甫 was mindfully using this as a kind of garden path sentence (similar to "The old man the boat."). This would shift the meaning to what I wrote above: "As the bright sun separates the mountains, the world and its affairs both are far away."

Du Fu 杜甫 lived through some difficulties and wrote (as I have so far seen) some dark stuff, such as "Jiārén" 《佳人》 ("Beautiful Woman") and "Mèng Lǐ Bái" 《梦李白》 ("Dreaming of Li Bo"), but he also wrote "Wàng yuè" 《望岳》 ("Gazing at the Mountain"), which is not dark at all. Would it be inconceivable for "Zèng Wèi Bā chǔshì" 《赠卫八处士》 ("Presented to Wei Ba, an Unofficed Scholar") to have a happy-ish "screw-the-world" ending in the drunken spirit of Lǐ Bái 李白? He finally visits his friend after 20 years and they stay up until the sunrise, by that time forgetting the world and its affairs. He obviously dearly loved Lǐ Bái 李白, who we know knew how to use alcohol to forget the world and its affairs; wouldn't it be reasonable that he could do likewise? Also, if he had traveled so far after so long, wouldn't tomorrow be much too early for him to be already gone and on the other side of the mountains?

I eagerly await your response.

Randy several times asked for my critique of his interpretation of the last line, so I will give it, prefaced by my declaration that I think that context, content, sentiment, sense, drift, flow, and so forth outweigh strict grammatical rules, especially in poetry, and especially in the hands of a master like Du Fu.  Also, the English version should not be jarring, should make sense, and convey what the poet was aiming to express.  Here, in this last line, I think what Du Fu is trying to say is "Tomorrow we will be separated by mountains; both of us immersed in the boundless affairs of the world".

Back in the days of the towering Berkeley savant sinologues, Peter Alexis Boodberg (1903-1972) and Edward Hetzel Schafer (1913-1991), there were monumental disputes over whether Classical Chinese / Literary Sinitic (CC/LS) had grammar, and, if so, whether it were absolutely strict and inalterably fixed.  See, for example, Schafer's (in)famous "Supposed 'Inversions' in T'ang Poetry", Journal of the American Oriental Society, Vol. 96, No. 1 (Jan. – Mar., 1976), pp. 119-121 (3 pages).

Of course, CC/LS had/has grammar, elsewise an author wouldn't be able to write anything that makes sense (without grammar, writing would just be a jumble of words).  That is why I always felt comfortable and confident in teaching CC/LS to generations of students.  A good presentation of Chinese grammar may be found in the three-volume A First Course in Literary Chinese (Cornell, 1968) by Harold Shadick.  On the other hand, as you will see from my closing comment to this post, context trumps grammar.  To read poetry and make elegant sense of it, one has to understand what the poet is trying to say, and that takes learning / knowledge and intuition.  The only way to develop learning / knowledge and intuition is through vast amounts of reading / exposure to history and literature of all sorts.

Steve Owen translated the complete poems of Du Fu.  Here's how he handled the last couplet of the one under discussion:

Tomorrow we will be divided by mountains,

for both the world’s affairs are a vast blur.

Xiuyuan Mi comments:

[隔] is divide; simply introduces that the two would henceforth live different lives and rarely hear from the other person. Quite difficult to parse out the syntax though.

Notice that Xiuyuan recognizes the difficulty of parsing the syntax, but still strives to grasp the poet's underlying intent.

Here are Denis Mair's observations:

Owen is right, but I think there's an additional level of meaning, the poignant sense that both of them will not know what happens to each other. In this moment of strong feelings about friendship, the wish of each to know what will befall each other is very strong.

   I would translate the ge2 as "separated by"— "In days to come, separated by mountain peaks/"
   I would translate liang3 as "the two of us," but also implying "each other" — "events in store for us beyond either's reckoning" 
The beauty of 茫茫 is that it expresses unknowableness both in time (they both face an uncertain future) and in space (neither of them know what will befall the other, due to being apart).

Denis is a published poet, both in Chinese and in English, and has translated thousands of poems from and to Chinese.

Go with the flow, Randy.

 


Language Log ([syndicated profile] languagelog_feed) wrote2025-06-10 12:35 pm

AI schoolwork

Posted by Mark Liberman

Current LLMs can answer questions or follow instructions in a way that makes them useful as cheap and quick clerical assistants. Many students use them for doing homework, writing papers, and even taking exams — and many journalists, government functionaries, lawyers, scientists, etc., are using them in similar ways. The main drawback from users' point of view is that LLMs often make stuff up — this seems to have happened a couple of weeks ago to the crew who composed the MAHA report, and is an increasingly widespread problem in court documents. Attempts at AI-detectors have totally failed, and so the current academic trends are either in the direction of testing methods that isolate students from LLM-connected devices, or in the direction of syllabus structures that directly encourage students to use LLMs, but try to teach them to use them better.

Some of these attempts fall into the category of "prompt engineering" — this is certainly needed, but it's very much a moving target, and so I'm skeptical of its value. My colleague Chris Callison-Burch has devised some "AI-Enhanced learning" assignments that strike me as more likely to help students learn course content as well as LLM skills. I'm planning to spend the next month or so re-doing (aspects of) the syllabus for my undergrad Linguistics course in a similar spirit. One problem is that students in different schools at Penn currently have access to different software licenses, so some assignments might be free for some students but require non-trivial access fees for others.

In the news recently was OSU's total capitulation: "Ohio State launches bold AI Fluency initiative to redefine learning and innovation", 6/4/2025:

Initiative will embed AI into core undergraduate requirements and majors, ensuring all students graduate equipped to apply AI tools and applications in their fields

With artificial intelligence poised to reshape the future of learning and work, The Ohio State University announced today an ambitious new initiative to ensure that every student will graduate with the AI proficiencies necessary to compete and lead now.

Launching this fall for first-year students, Ohio State’s AI Fluency initiative will embed AI education into the core of every undergraduate curriculum, equipping students with the ability to not only use AI tools, but to understand, question and innovate with them — no matter their major.

I gather that this was a top-down decision, made without a lot of faculty consultation, and it'll be interesting to see how it works out. Needless to say, there's been a certain amount of academic pushback from around the world…

Meanwhile, we continue to see a trickle of stories about AI stumbles — for example Mark Tyson, "ChatGPT 'got absolutely wrecked' by Atari 2600 in beginner's chess match — OpenAI's newest model bamboozled by 1970s logic", Tom's Hardware 6/9/2025 (and here's the LinkedIn post he's reporting on).

And there's a new term (and initialism) to cover such cases — AJI = "Artificial Jagged Intelligence". This turns out not to mean that AI systems can wound you if not handled carefully, though that's also true.

Lakshmi Varanasi, "AI leaders have a new term for the fact that their models are not always so intelligent", Business Insider 6/7/2025:

  • Google CEO Sundar Pichai says there's a new term for the current phase of AI: "AJI."
  • Pichai said it stands for "artificial jagged intelligence," and is the precursor to AGI.
  • AJI is marked by highs and lows, instances of impressive intelligence alongside a near lack of it.

Google CEO Sundar Pichai referred to this phase of AI as AJI, or "artificial jagged intelligence," on a recent episode of Lex Fridman's podcast.

"I don't know who used it first, maybe Karpathy did," Pichai said, referring to deep learning and computer vision specialist Andrej Karpathy, who cofounded OpenAI before leaving last year.

The cited podcast is here, FWIW.