

Poster in Workshop: Building Trust in LLMs and LLM Applications: From Guardrails to Explainability to Regulation

Evaluating Text Humanlikeness via Self-Similarity Exponent

Ilya Pershin


Abstract:

Evaluating text generation quality in large language models (LLMs) is critical for their deployment. We investigate the self-similarity exponent S, a fractal-based measure, as a metric for quantifying "humanlikeness." Using texts from a publicly available dataset and Qwen models (with and without instruction tuning), we find that human-written texts exhibit S = 0.57, non-instruct models show higher values, and instruction-tuned models approach human-like patterns. Larger models improve generation quality but benefit more from instruction tuning. Our findings suggest S is an effective metric for assessing LLM performance.
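The abstract does not spell out how the self-similarity exponent S is estimated. As a rough, non-authoritative sketch only, the snippet below computes a Hurst-style self-similarity exponent via detrended fluctuation analysis (DFA) of a 1-D series (e.g., per-token surprisals of a text under a language model); the choice of DFA and the function name self_similarity_exponent are illustrative assumptions, not the authors' stated procedure.

import numpy as np

def self_similarity_exponent(signal, scales=None):
    """Estimate a Hurst-style self-similarity exponent of a 1-D series
    via detrended fluctuation analysis (DFA)."""
    x = np.asarray(signal, dtype=float)
    # Integrated (cumulative-sum) profile of the mean-centered series.
    profile = np.cumsum(x - x.mean())
    n = len(profile)
    if scales is None:
        # Logarithmically spaced window sizes between 8 and n // 4.
        scales = np.unique(np.logspace(np.log10(8), np.log10(n // 4), 12).astype(int))
    fluctuations = []
    for s in scales:
        n_windows = n // s
        segments = profile[: n_windows * s].reshape(n_windows, s)
        t = np.arange(s)
        rms = []
        for seg in segments:
            # Remove the local linear trend within each window.
            trend = np.polyval(np.polyfit(t, seg, 1), t)
            rms.append(np.sqrt(np.mean((seg - trend) ** 2)))
        fluctuations.append(np.mean(rms))
    # The exponent is the slope of log F(s) against log s.
    slope, _ = np.polyfit(np.log(scales), np.log(fluctuations), 1)
    return slope

# Example: uncorrelated white noise yields an exponent near 0.5,
# while long-range-correlated series yield higher values.
rng = np.random.default_rng(0)
print(self_similarity_exponent(rng.normal(size=4096)))  # approximately 0.5

Under this reading, values of S above 0.5 indicate long-range correlations in the series, which is consistent with the abstract's comparison of human-written text (S = 0.57) against model-generated text.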
