[Tp-legal] Position Paper "Project Alexandria: Towards Freeing Scientific Knowledge from Copyright Burdens via LLMs"
Paweł Kamocki
pawel.kamocki at gmail.com
Fri Feb 28 09:56:26 CET 2025
Dear all,
Coincidentally, this more academic paper on a similar concept was released
just a couple of days ago:
https://scholarship.law.unc.edu/nclr/vol103/iss2/4/
A great subject for our WG.
Kind regards,
Paweł
On Thu, 27 Feb 2025 at 14:08, Paweł Kamocki <pawel.kamocki at gmail.com> wrote:
> Dear Philippe, dear all,
>
> @Philippe: Thank you very much for sharing this, most inspiring!
> @all: would you agree that these Knowledge Units (definition below) are in
> fact elaborate DTFs?
>
> Knowledge Unit (KU): A set of entities, attributes, and relationships,
> capturing a short original text excerpt.
>
> Each Knowledge Unit captures:
>
> • Entities: the core concepts or objects in the paragraph, with relevant
>   attributes.
> • Relationships: statements that connect or link entities, such as causal
>   or definitional relationships.
> • Attributes: statements that describe entities according to the excerpt.
> • Context summary: A few sentences summarizing the previous knowledge
>   units.
> • Sentence MinHash: A list of MinHashes of the source sentences used to
>   generate this KU.
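
The five components listed above could be modelled roughly as follows. This is only a minimal sketch under stated assumptions, not the paper's implementation: the class name `KnowledgeUnit`, its field types, the helpers `sentence_minhash` and `estimated_similarity`, and the parameters `num_perm` and `shingle` are all hypothetical and chosen for illustration. The MinHash is a toy standard-library version over word shingles.

```python
import hashlib
from dataclasses import dataclass, field


def _seeded_hash(text: str, seed: int) -> int:
    # One 64-bit hash function per seed, built from BLAKE2b with a salt.
    digest = hashlib.blake2b(
        text.encode("utf-8"), digest_size=8, salt=seed.to_bytes(8, "little")
    ).digest()
    return int.from_bytes(digest, "big")


def sentence_minhash(sentence: str, num_perm: int = 16, shingle: int = 3) -> tuple:
    """Toy MinHash signature over word shingles (parameters are assumptions)."""
    words = sentence.lower().split()
    # Sentences shorter than `shingle` words yield a single shingle.
    count = max(1, len(words) - shingle + 1)
    shingles = {" ".join(words[i:i + shingle]) for i in range(count)}
    # For each seeded hash function, keep the minimum value over all shingles.
    return tuple(
        min(_seeded_hash(s, seed) for s in shingles) for seed in range(num_perm)
    )


def estimated_similarity(a: tuple, b: tuple) -> float:
    """Fraction of matching signature positions (estimates Jaccard similarity)."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


@dataclass
class KnowledgeUnit:
    # Hypothetical field layout mirroring the five bullet points quoted above.
    entities: list = field(default_factory=list)            # core concepts
    relationships: list = field(default_factory=list)       # (subject, relation, object)
    attributes: dict = field(default_factory=dict)          # entity -> statements
    context_summary: str = ""                               # summary of previous KUs
    sentence_minhashes: list = field(default_factory=list)  # one signature per sentence


sentence = "Water boils at 100 degrees Celsius at sea level."
ku = KnowledgeUnit(
    entities=["water"],
    relationships=[("water", "boils at", "100 degrees Celsius")],
    attributes={"water": ["boiling point stated for sea level"]},
    context_summary="",
    sentence_minhashes=[sentence_minhash(sentence)],
)
```

Note that in this sketch the KU keeps only structured statements plus hashes of the source sentences, not the sentences themselves; the signatures merely allow a KU to be compared against candidate source sentences via `estimated_similarity`.
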
>
>
> Kind regards,
>
> Paweł
>
> On Thu, 27 Feb 2025 at 09:41, Genêt, Philippe <P.Genet at dnb.de> wrote:
>
>> Dear all,
>>
>>
>>
>> Today, a position paper (with participation of TIB, LAION, L3S, Uni TÜ
>> etc.) has been published that envisages extracting “Knowledge Units” from
>> in-copyright scholarly texts that can be used by LLMs in a legally sound
>> way. I think it may be of some interest to you. :-)
>>
>>
>>
>> The paper can be downloaded here: https://arxiv.org/pdf/2502.19413
>>
>>
>>
>> Cheers
>>
>> Philippe
>>
>>
>>
>> --
>> Philippe Genêt
>>
>> Koordinator DNB at Text+
>>
>>
>> Deutsche Nationalbibliothek
>> Fachbereich Informationsinfrastruktur
>> Adickesallee 1
>> 60322 Frankfurt am Main
>>
>> Telefon: +49 69 1525-1847
>>
>> E-Mail: p.genet at dnb.de
>>
>>
>>
>> text-plus.org <http://www.text-plus.org/>
>>
>> dnb.de <http://www.dnb.de/>
>>
>>
>> --
>> Tp-legal mailing list
>> Tp-legal at lists.dnb.de
>> https://lists.dnb.de/mailman/listinfo/tp-legal
>>
>