<div dir="ltr">Dear all,<div><br></div><div>Coincidentally, this more academic paper on a similar concept was released just a couple of days ago: <a href="https://scholarship.law.unc.edu/nclr/vol103/iss2/4/">https://scholarship.law.unc.edu/nclr/vol103/iss2/4/</a> </div><div>A great subject for our WG.</div><div><br></div><div>Kind regards,</div><div>Paweł</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, 27 Feb 2025 at 14:08, Paweł Kamocki &lt;<a href="mailto:pawel.kamocki@gmail.com" target="_blank">pawel.kamocki@gmail.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr">Dear Philippe, dear all,<div><br></div><div>@Philippe: Thank you very much for sharing this, most inspiring!</div><div>@all: Would you agree that these Knowledge Units (definition below) are in fact elaborate DTFs?<br></div><div>
<p style="margin:0px;font-size:10px;font-family:Helvetica;color:rgb(0,0,0)">Knowledge Unit (KU): A set of entities, attributes, and relationships, capturing a short original text excerpt.</p>
<p style="margin:0px;font-size:10px;font-family:Helvetica;color:rgb(0,0,0)">Each Knowledge Unit captures:</p>
<p style="margin:0px;font-size:10px;font-family:Helvetica;color:rgb(0,0,0)">• Entities: the core concepts or objects in the paragraph, with relevant attributes.</p>
<p style="margin:0px;font-size:10px;font-family:Helvetica;color:rgb(0,0,0)">• Relationships: statements that connect or link entities, such as causal or definitional relationships.</p>
<p style="margin:0px;font-size:10px;font-family:Helvetica;color:rgb(0,0,0)">• Attributes: statements that describe entities according to the excerpt.</p>
<p style="margin:0px;font-size:10px;font-family:Helvetica;color:rgb(0,0,0)">• Context summary: A few sentences summarizing the previous knowledge units.</p>
<p style="margin:0px;font-size:10px;font-family:Helvetica;color:rgb(0,0,0)">• Sentence MinHash: A list of MinHashes of the source sentences used to generate this KU.</p>
<p style="margin:0px;font-size:10px;font-family:Helvetica;color:rgb(0,0,0)"><br></p>
<p style="margin:0px;font-size:10px;font-family:Helvetica;color:rgb(0,0,0)">Kind regards,</p>
<p style="margin:0px;font-size:10px;font-family:Helvetica;color:rgb(0,0,0)">Paweł</p></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, 27 Feb 2025 at 09:41, Genêt, Philippe &lt;<a href="mailto:P.Genet@dnb.de" target="_blank">P.Genet@dnb.de</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div>





<div lang="DE">
<div>
<p class="MsoNormal"><span style="font-size:10pt;font-family:Verdana,sans-serif">Dear all,</span></p>
<p class="MsoNormal"><span style="font-size:10pt;font-family:Verdana,sans-serif">&nbsp;</span></p>
<p class="MsoNormal"><span lang="EN-GB" style="font-size:10pt;font-family:Verdana,sans-serif">Today, a position paper (with participation of TIB, LAION, L3S, Uni TÜ etc.) was published that envisages extracting from in-copyright scholarly texts “Knowledge Units” that LLMs can use in a legally sound way. I think it may be of some interest to you. 🙂</span></p>
<p class="MsoNormal"><span lang="EN-GB" style="font-size:10pt;font-family:Verdana,sans-serif">&nbsp;</span></p>
<p class="MsoNormal"><span lang="EN-GB" style="font-size:10pt;font-family:Verdana,sans-serif">The paper can be downloaded here:
<a href="https://arxiv.org/pdf/2502.19413" target="_blank">https://arxiv.org/pdf/2502.19413</a></span></p>
<p class="MsoNormal"><span lang="EN-GB" style="font-size:10pt;font-family:Verdana,sans-serif">&nbsp;</span></p>
<p class="MsoNormal"><span lang="EN-GB" style="font-size:10pt;font-family:Verdana,sans-serif">Cheers</span></p>
<p class="MsoNormal"><span lang="EN-GB" style="font-size:10pt;font-family:Verdana,sans-serif">Philippe</span></p>
<p class="MsoNormal"><span lang="EN-GB" style="font-size:10pt;font-family:Verdana,sans-serif">&nbsp;</span></p>
<p class="MsoNormal"><span style="font-size:10pt;font-family:Verdana,sans-serif">--<br>
</span><span style="font-size:9pt;font-family:Verdana,sans-serif">Philippe Genêt</span></p>
<p class="MsoNormal"><span style="font-size:9pt;font-family:Verdana,sans-serif">Coordinator DNB@Text+</span></p>
<p class="MsoNormal"><span style="font-size:9pt;font-family:Verdana,sans-serif"><br>
Deutsche Nationalbibliothek<br>
Fachbereich Informationsinfrastruktur<br>
Adickesallee 1<br>
60322 Frankfurt am Main<br>
<br>
</span></p>
<p class="MsoNormal"><span style="font-size:9pt;font-family:Verdana,sans-serif">Phone: +49 69 1525-1847</span></p>
<p class="MsoNormal"><span style="font-size:9pt;font-family:Verdana,sans-serif">Email:
<a href="mailto:p.genet@dnb.de" target="_blank">p.genet@dnb.de</a></span></p>
<p class="MsoNormal"><span style="font-size:9pt;font-family:Verdana,sans-serif">&nbsp;</span></p>
<p class="MsoNormal"><span style="font-size:9pt;font-family:Verdana,sans-serif"><a href="http://www.text-plus.org/" target="_blank"><span lang="FR">text-plus.org</span></a></span></p>
<p class="MsoNormal"><span style="font-size:9pt;font-family:Verdana,sans-serif"><a href="http://www.dnb.de/" target="_blank"><span lang="FR">dnb.de</span></a></span></p>
<p class="MsoNormal"><span lang="FR">&nbsp;</span></p>
</div>
</div>

-- <br>
Tp-legal mailing list<br>
<a href="mailto:Tp-legal@lists.dnb.de" target="_blank">Tp-legal@lists.dnb.de</a><br>
<a href="https://lists.dnb.de/mailman/listinfo/tp-legal" rel="noreferrer" target="_blank">https://lists.dnb.de/mailman/listinfo/tp-legal</a><br>
</div></blockquote></div>
</blockquote></div>