Every service that wants to match people to things faces the same structural problem: it must build its own model of what each person wants, from scratch, using only the data it can collect within its own walls. The result is a landscape of siloed, redundant, and opaque preference models that users cannot inspect, correct, or carry between services.
This paper introduces the fraglet: a small, structured unit that captures a single facet of a subject's characteristics, encoded as both human-readable text and a high-dimensional vector embedding. A fraglet is deliberately focused. It is one composable piece, not a comprehensive profile. Fraglets are not personal data stores. They are shared artefacts that can describe people, places, organisations, or things. Any number of people can associate with the same fraglet, adopt it as a starting point, or derive new fraglets from it. Hosted at fraglet.org and addressable by URI, fraglets give any service the ability to understand characteristics and match against them without building a behavioural model of its own. We describe the architecture, the association model, the matching mechanics, and a production deployment that validates the pattern.
The matching problem
A concert discovery service wants to recommend gigs to a new user. A property platform wants to surface neighbourhoods that fit a buyer's sensibility. An energy provider wants to match tariff structures to household consumption patterns. In each case, the service needs a structured understanding of the person, and it has none.
The conventional approach is to build that understanding internally: track behaviour, infer preferences, and gradually assemble a model. This works, but it creates three persistent problems.
First, cold start. Every new service begins blind. Until a user has generated enough behavioural data on that specific platform, recommendations are generic at best. The user's rich history elsewhere is inaccessible.
Second, opacity. The resulting models are opaque to the user. You cannot see what Spotify thinks you like, edit Netflix's model of your viewing taste, or correct a recruitment platform's assumptions about your working style. Preference is inferred, never stated.
Third, non-portability. Each model is locked inside the platform that built it. A decade of carefully calibrated food preferences on one service cannot be carried to another. The user rebuilds from scratch, repeatedly.
These are not technical failures of individual platforms. They are structural consequences of an architecture in which preference data is always platform-owned and platform-scoped. The fraglet proposes a different architecture: extract preference into a structured, portable, shareable unit that exists independently of any single service.
Related work
The past eighteen months have seen a convergence of regulatory pressure, open-protocol development, and commercial investment around the idea that personal context should not be locked inside platforms. Several serious initiatives are now in play, each addressing a genuine and valuable part of the problem.
Personal context protocols
The Human Context Protocol (HCP), developed by researchers at MIT, Stanford, Oxford, and Google DeepMind, proposes a user-centric architecture for preference governance. Users maintain a centralised vault of structured preferences, and AI services access relevant subsets through mediated interfaces with scoped permissions and revocation controls. HCP offers a strong model for consent and access scoping, explicitly designed to complement the Model Context Protocol (MCP).
The Personal Context Protocol (PCP) takes a more pragmatic approach: a persistent personal data node that stores durable facts, preferences, and AI-generated summaries. Applications connect via MCP-compatible configuration and request specific read permissions. PCP focuses on the practical problem of re-explaining context to every new AI tool.
The Open Context Layer (OCL), from Plurality Network, is a decentralised memory protocol that lets users carry preferences, goals, and behaviour across applications and agents. Built on encryption and multi-party compute, OCL stores context as vectorised embeddings with standardised schemas, accessible through a universal API.
Memory infrastructure
Mem0 provides an open-source memory layer between applications and LLMs. It automatically extracts, stores, and retrieves relevant information from conversations across three memory scopes: user, session, and agent. With over 41,000 GitHub stars and selection as AWS's exclusive memory provider for their Agent SDK, Mem0 represents significant traction in the conversational memory space.
Policy and regulation
The Data Transfer Initiative (DTI) has published principles for AI data portability, collaborated with Inflection AI on transferring conversational histories, and advises on DMA implementation. Regulatory tailwinds are building: the EU Digital Markets Act's one-year review is expected to classify conversational AI as a Virtual Assistant with targeted data portability obligations; the UK's Data Use and Access Act reached Royal Assent in June 2025; and over 1,100 AI-related bills were introduced in US states in 2025 alone.
The gap
These initiatives cluster around two related problems: conversational memory (how to stop re-explaining yourself to AI assistants) and preference governance (how to own and control your preference data across AI services). Both are valuable. But neither directly addresses a third problem:
How does a service that has never interacted with a person before understand their characteristics well enough to match them to relevant items, without building its own behavioural model from scratch?
Existing protocols are fundamentally personal data stores: one person, one vault, private by default. They answer the question "What does this person prefer?" Fraglets address a different question: "How close are this person's characteristics to this item?" And they do so with a model that is shareable, adoptable, and collaborative.
| Dimension | HCP / PCP / OCL / Mem0 | Fraglets |
|---|---|---|
| Primary function | Remember context across AI interactions | Enable matching against catalogues |
| Data model | Key-value preferences, conversation history | Structured text + vector embedding |
| Core operation | Retrieval: "What does this person prefer?" | Similarity: "How close to this item?" |
| Relationship model | 1:1 (one person, one data store) | Many-to-many (shared, adoptable, forkable) |
| Architecture | Centralised vault or personal node | Centralised API service (fraglet.org) |
| Matching | None inherent; preferences inform prompts | Built-in cosine similarity |
| Privacy approach | Scoped access to personal data | No personal data; only characteristic expression |
| Disclosure | PCP: three-tier progressive disclosure; others: binary access | Three-tier (title / brief / detail), also enabling composition |
| Group composition | Not addressed; strictly individual | Combine fraglets across people for group matching |
These initiatives are complementary, not competitive. HCP could govern which services access a user's fraglets. PCP and OCL could store references to fraglets alongside other context. Mem0 could learn from fraglet interactions. MCP could provide the transport layer. DTI's portability principles could inform fraglet server governance. Fraglets occupy a specific niche: the structured profile and matching layer.
The fraglet model
What a fraglet is
A fraglet is a structured, portable unit of characteristics within a given domain. It is composed of human-readable descriptive fields (a title, a brief summary, a detailed narrative, categorical tags, and domain-specific extensions) alongside a high-dimensional vector embedding derived from those fields. The dual representation matters: a fraglet is legible to people and computable by machines at the same time.
The term combines fragment with the diminutive -let. A fraglet is intentionally small: it captures a single facet, not a comprehensive profile. A jazz listener's late-night sensibility is one fraglet; their weekend brunch taste might be another. A neighbourhood's architectural character is one fraglet; its dining scene is a separate one.
Characteristics are not monolithic. A person's relationship with music is not one thing. A venue's identity is not one thing. By making the unit of exchange a single facet rather than a complete portrait, fraglets stay small enough to share and specific enough to match against. Multiple fraglets can be combined to build a richer picture when needed, without requiring any single fraglet to carry the full burden of description.
Fraglets are not limited to describing people. A venue, a neighbourhood, a product, a team, or a service can all be described by fraglets. The schema makes no assumption about what the subject is, only that its characteristics can be expressed in structured natural language and compared by semantic similarity.
Five properties define a fraglet:
- Portable. A fraglet follows a domain-agnostic schema. Any service implementing the pattern can accept and match against it, without integration with the originating platform.
- Dual-represented. Human-readable text and a vector embedding are always in sync. The text is interpretable; the embedding is computable. Neither is sufficient alone.
- Progressively detailed. The text fields form a hierarchy: a title (a few words), a brief (a sentence or two), and a detail (a rich narrative). A consuming service chooses the resolution it needs. A listing page might show only titles, a detail view the brief, a matching algorithm the full detail and embedding. PCP implements a similar three-tier hierarchy for personal context. In fraglets, the hierarchy also serves a composability function: when combining fraglets from multiple people, the title level keeps the combination readable at a glance.
- Personal, not personal. A fraglet expresses your taste but contains no personal data. No name, email, location, or behavioural history. A person might have several music fraglets for different moods and choose which to share in a given situation. The selection itself is part of the expression.
- Immutable. Once created, a fraglet does not change. Refinement creates a new fraglet linked to the original via a parent reference, preserving provenance.
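The five properties can be made concrete with a small data structure. The following is a minimal sketch in Python, not the fraglet.org schema itself: the field names title, brief, detail, tags, additional, domain, and parent_id follow the text of this paper, while the id field, the types, and the text_at helper are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Mapping, Optional, Tuple
import uuid


@dataclass(frozen=True)  # frozen: attributes cannot be reassigned after creation
class Fraglet:
    domain: str
    title: str    # a few words
    brief: str    # a sentence or two
    detail: str   # a rich narrative
    tags: Tuple[str, ...] = ()
    additional: Mapping[str, Any] = field(default_factory=dict)  # domain-specific extensions
    embedding: Tuple[float, ...] = ()   # derived from the text fields
    parent_id: Optional[str] = None     # set only on derived fraglets
    id: str = field(default_factory=lambda: uuid.uuid4().hex)  # hypothetical identifier

    def text_at(self, level: str) -> str:
        """Return the text at the requested resolution: title, brief, or detail."""
        if level not in ("title", "brief", "detail"):
            raise ValueError(f"unknown level: {level}")
        return getattr(self, level)


late_night = Fraglet(
    domain="music",
    title="Nocturnal jazz wanderer",
    brief="Slow, smoky small-group jazz for late evenings.",
    detail="Prefers intimate trio recordings, brushed drums, and unhurried ballads.",
    tags=("jazz", "late-night"),
)
```

Note that the structure has no field for a name, an email address, or any user reference, which mirrors the "no personal data" property. Freezing the dataclass is one simple way to express immutability in application code; a server would presumably enforce it at the storage layer as well.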
Association, not ownership
The architectural decision that most distinguishes fraglets from existing personal context systems: users do not own fraglets. They associate with them.
A fraglet entity contains no user references. The association between a person and a fraglet is managed server-side as a separate concern. Multiple people can associate with the same fraglet. A fraglet exists independently of any single person's relationship to it.
This is a fundamentally different model from a personal data store:
- A music venue publishes a fraglet describing its programming sensibility. Hundreds of people associate with it, signalling affinity without any of them having authored it.
- A food critic publishes their palate as an open fraglet. Readers who share similar taste adopt it as a starting point, deriving their own variations.
- A community maintains a shared fraglet representing its collective character. New members associate with it; over time, derived versions emerge as the community evolves.
In each case, the fraglet is a shared artefact, not a personal record. Authored by one, meaningful to many. It can be created by a service, an organisation, or a community, not only by individuals. The association model enables collaborative, forkable structures around characteristics. No personal context vault is designed to do this.
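One way to picture the separation between fraglets and associations is a many-to-many index that lives outside the fraglet entity. The sketch below is an in-memory illustration only; the class name, method names, and identifiers are hypothetical, and a real server would presumably back this with a database table.

```python
from collections import defaultdict
from typing import Dict, Set


class AssociationStore:
    """Many-to-many links between users and fraglets, kept outside the fraglet."""

    def __init__(self) -> None:
        self._by_user: Dict[str, Set[str]] = defaultdict(set)
        self._by_fraglet: Dict[str, Set[str]] = defaultdict(set)

    def associate(self, user_id: str, fraglet_id: str) -> None:
        self._by_user[user_id].add(fraglet_id)
        self._by_fraglet[fraglet_id].add(user_id)

    def dissociate(self, user_id: str, fraglet_id: str) -> None:
        self._by_user[user_id].discard(fraglet_id)
        self._by_fraglet[fraglet_id].discard(user_id)

    def fraglets_of(self, user_id: str) -> Set[str]:
        # Return a copy so callers cannot mutate the index.
        return set(self._by_user.get(user_id, ()))

    def association_count(self, fraglet_id: str) -> int:
        return len(self._by_fraglet.get(fraglet_id, ()))


store = AssociationStore()
store.associate("user-a", "venue-programming")
store.associate("user-b", "venue-programming")
store.associate("user-a", "late-night-jazz")
```

Because the links are stored separately, removing a user's association never touches the fraglet itself, and the fraglet can be served without revealing who relates to it.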
Creation, adoption, and derivation
Creation
A fraglet begins with a signal: a playlist, a set of favourites, a list of liked items, or a free-text description of characteristics. The signal can come from an existing platform via API or from direct input.
A large language model analyses the input and produces a structured JSON object constrained to the fraglet schema: a descriptive title, a brief summary, a detailed narrative, categorical tags, and domain-specific extensions. The LLM's role is translating unstructured signals into the structured form the schema requires.
The descriptive text fields are then passed through an embedding model to produce a high-dimensional vector. This embedding captures the semantic essence of the fraglet, enabling similarity computation against item catalogues.
Finally, the user reviews the generated fraglet, modifies any fields that don't accurately reflect their characteristics, and saves it. The embedding is regenerated to reflect edits. LLM generation is optional. A conforming server can also accept fraglets where the user provides all text fields directly.
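Whether the text fields come from an LLM or directly from a user, a server needs to check that they fit the schema before embedding them. The following sketch shows one plausible validation step. The required fields follow the paper's description; the function name, the exact rules (for example, rejecting empty strings), and the sample input are assumptions.

```python
import json
from typing import Any, Dict

REQUIRED_TEXT_FIELDS = ("title", "brief", "detail")


def parse_generated_fraglet(raw: str) -> Dict[str, Any]:
    """Validate a JSON draft before it is shown to the user for review."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for name in REQUIRED_TEXT_FIELDS:
        value = data.get(name)
        if not isinstance(value, str) or not value.strip():
            raise ValueError(f"missing or empty text field: {name}")
    tags = data.get("tags", [])
    if not isinstance(tags, list) or not all(isinstance(t, str) for t in tags):
        raise ValueError("tags must be a list of strings")
    additional = data.get("additional", {})
    if not isinstance(additional, dict):
        raise ValueError("additional must be an object")
    return {
        "title": data["title"].strip(),
        "brief": data["brief"].strip(),
        "detail": data["detail"].strip(),
        "tags": tags,
        "additional": additional,
    }


sample = (
    '{"title": "Ambient explorer", "brief": "Slow textures.", '
    '"detail": "Drawn to long-form ambient records.", "tags": ["ambient"]}'
)
draft = parse_generated_fraglet(sample)
```

The embedding would be computed only after the user has reviewed and saved the draft, so that the vector reflects the final text.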
Adoption
When a fraglet is published with open visibility, anyone can adopt it: associating with it as-is, or using it as a starting point for their own variation.
Adoption is not copying in the conventional sense. When a person adopts a fraglet, they associate with the existing entity. If they want to modify it, they derive a new fraglet with a parent_id referencing the original. The original persists, unchanged, for everyone else who associates with it.
The result is a social layer around characteristics that doesn't require a social network. People discover fraglets that resonate with them, associate with them or build on them, and over time a distributed graph of shared profiles emerges.
Derivation chains
Because fraglets are immutable, every modification creates a new entity linked to its parent. Over time, this produces derivation chains: sequences of fraglets that show how a set of characteristics has evolved, branched, and been reinterpreted by different people.
A single fraglet may have multiple children, created by different people who each took the original in a different direction. The chain preserves provenance without constraining evolution. This is closer to a version history in a collaborative document than to a mutable profile that overwrites itself.
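Derivation can be sketched with an immutable record and a parent reference. In the example below, parent_id is the field named in this paper; the derive and lineage helpers, the reduced set of fields, and the identifiers are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Fraglet:
    id: str
    title: str
    detail: str
    parent_id: Optional[str] = None


def derive(parent: Fraglet, new_id: str, **changes: str) -> Fraglet:
    """Create a new fraglet from `parent`; the parent itself is left untouched."""
    return replace(parent, id=new_id, parent_id=parent.id, **changes)


def lineage(fraglet_id: str, store: Dict[str, Fraglet]) -> List[str]:
    """Walk parent references from a fraglet back to the root of its chain."""
    chain: List[str] = []
    current: Optional[str] = fraglet_id
    while current is not None:
        if current in chain:
            raise ValueError("cycle in parent references")
        chain.append(current)
        current = store[current].parent_id
    return chain


root = Fraglet(id="f1", title="Critic's palate", detail="Favours acidity and smoke.")
child_a = derive(root, "f2", detail="Favours acidity, smoke, and fermented heat.")
child_b = derive(root, "f3", title="Critic's palate, vegetarian")
store = {f.id: f for f in (root, child_a, child_b)}
```

Here the root has two children created independently, and `lineage("f2", store)` walks from the derived fraglet back to the original, which is all that provenance requires.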
Visibility and architecture
Visibility
Every fraglet has one of three visibility levels, set by its creator:
- Private. Visible only to associated users. A service may default all fraglets to private, restricting them to internal use.
- Selective. Shared with specific services or individuals. The receiving party can read and match against the fraglet but cannot redistribute it. Consent is explicit and revocable.
- Open. Publicly discoverable. Anyone can read it, match against it, and adopt it. Open fraglets are the foundation of the collaborative model.
Open visibility combined with the association model creates a commons for structured profiles: a shared space where characteristics can be published, discovered, and built upon.
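The three visibility levels translate naturally into an access check. The sketch below is one possible reading, not the service's actual rule set: the enum values follow the level names in this paper, while the function signature and the assumption that associated users can always read a fraglet they are associated with are illustrative.

```python
from enum import Enum
from typing import AbstractSet


class Visibility(Enum):
    PRIVATE = "private"
    SELECTIVE = "selective"
    OPEN = "open"


def can_read(
    visibility: Visibility,
    requester: str,
    associated: AbstractSet[str],
    granted: AbstractSet[str],
) -> bool:
    """Decide whether `requester` may read a fraglet.

    `associated` holds users associated with the fraglet; `granted` holds
    services or individuals given explicit, revocable access.
    """
    if visibility is Visibility.OPEN:
        return True
    if visibility is Visibility.SELECTIVE:
        return requester in associated or requester in granted
    # PRIVATE: explicit grants are ignored.
    return requester in associated
```

Under this reading, revoking consent for a selective fraglet amounts to removing an entry from the `granted` set.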
The fraglet.org service
Fraglets are hosted at fraglet.org: a centralised service that stores, manages, and serves fraglets over a standard API. Any developer can register, receive an API key, and start creating and consuming fraglets immediately. The service is designed around the same authentication pattern as Stripe or OpenAI: a prefixed key (frag_live_) used as a Bearer token.
A centralised service eliminates the discovery and trust problems that federated architectures introduce. Every fraglet lives at the same host, addressable at a stable URI. The service conforms to three design requirements:
- Consistent schema. Every fraglet conforms to the same structural schema. Domain-specific extensions use the additional field, preserving core compatibility across domains.
- Uniform security model. The three visibility levels (private, selective, open) govern access consistently. API key authentication ensures every request is attributable to a registered developer.
- Addressable fraglets. Every fraglet has a stable URI at api.fraglet.org/api/v1/fraglets/{id}, fetchable with a single authenticated HTTP request.
Server capabilities are published at a well-known metadata endpoint (/.well-known/fraglets), declaring the embedding model, supported domains, and API base URL. The schema and protocol are open. Nothing prevents future federation if the ecosystem grows to warrant it, but a single canonical service is the right starting point for establishing the pattern and ensuring consistency.
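As an illustration of the addressing and authentication pattern, the following sketch builds (but does not send) a request for a single fraglet using only the Python standard library. The path and the frag_live_ key prefix come from the description above; the https scheme, the Accept header, the helper name, and the placeholder identifier and key are assumptions. The response format is not specified here, so the example stops before parsing.

```python
import urllib.parse
import urllib.request

API_BASE = "https://api.fraglet.org/api/v1"  # https scheme is an assumption


def build_fraglet_request(fraglet_id: str, api_key: str) -> urllib.request.Request:
    """Build an authenticated GET request for one fraglet without sending it."""
    if not api_key.startswith("frag_live_"):
        raise ValueError("expected a key with the frag_live_ prefix")
    safe_id = urllib.parse.quote(fraglet_id, safe="")
    return urllib.request.Request(
        f"{API_BASE}/fraglets/{safe_id}",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Accept": "application/json",
        },
        method="GET",
    )


request = build_fraglet_request("example-id", "frag_live_placeholder")
# To send it: urllib.request.urlopen(request)
```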
Matching mechanics
The fraglet's embedding vector enables direct similarity computation against item catalogues. When a service embeds its catalogue items using the same model, cosine similarity between the fraglet vector and item vectors produces a ranked list of matches.
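The core matching step is short enough to show in full. The sketch below ranks a toy catalogue by cosine similarity in pure Python; the three-dimensional vectors and item identifiers are invented for illustration, and a production system would more likely delegate this to a vector index.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_items(
    fraglet_vec: Sequence[float], catalogue: Dict[str, Sequence[float]]
) -> List[Tuple[str, float]]:
    """Return (item_id, score) pairs, best match first."""
    scored = [
        (item_id, cosine_similarity(fraglet_vec, vec))
        for item_id, vec in catalogue.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


fraglet_vec = [1.0, 0.0, 0.0]
catalogue = {
    "gig-a": [0.9, 0.1, 0.0],
    "gig-b": [0.0, 1.0, 0.0],
    "gig-c": [-1.0, 0.0, 0.0],
}
ranked = rank_items(fraglet_vec, catalogue)
```

In this toy example, gig-a points in nearly the same direction as the fraglet and ranks first, gig-b is orthogonal, and gig-c points the opposite way and ranks last.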
Embedding composition
The embedding is composed from three fields, each contributing differently to the semantic signal:
- Detail carries the most semantic weight: a rich narrative that captures nuance, context, and the qualitative character of the profile.
- Tags add dimensional breadth, anchoring the embedding in specific, searchable concepts.
- Additional provides domain-specific context: sub-profiles, structured attributes, and contextual metadata.
This composition means the embedding reflects both the texture of the narrative (what makes this profile distinctive) and the structure of the domain (what categories and attributes apply). The result is a vector that captures semantic similarity more effectively than keyword matching or category-based filtering.
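This paper does not specify how the three fields are combined or weighted before embedding. One simple possibility, shown here purely as an assumption, is to concatenate them into a single text with the narrative first. The function name, the labels, and the JSON serialisation of the additional field are all illustrative.

```python
import json
from typing import Any, Mapping, Sequence


def compose_embedding_text(
    detail: str, tags: Sequence[str], additional: Mapping[str, Any]
) -> str:
    """Join detail, tags, and additional into one string for an embedding model."""
    parts = [detail.strip()]
    if tags:
        parts.append("Tags: " + ", ".join(tags))
    if additional:
        # sort_keys keeps the text stable, so equal inputs embed identically
        parts.append("Additional: " + json.dumps(additional, sort_keys=True))
    return "\n".join(parts)


text = compose_embedding_text(
    "Late-night small-group jazz.",
    ["jazz", "late-night"],
    {"tempo": "slow"},
)
```

Deterministic serialisation matters here: if the same fraglet text could produce different input strings, the text and the embedding would no longer be reliably in sync.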
Multi-fraglet matching
When a user associates with multiple fraglets within the same domain, their combined signal can be represented by averaging the embedding vectors. This produces a blended profile that reflects the breadth of their characteristics. Services can also match against individual fraglets separately, presenting results grouped by facet.
The progressive detail structure makes this practical for group matching, finding items that suit multiple people at once. A service can collect one fraglet from each person in a group, average the embeddings, and match the result against its catalogue. Because each fraglet is a focused unit, the combined signal remains coherent rather than dissolving into noise. And because the text fields are progressively detailed, the service can present the group with a readable summary of what was combined: a list of titles ("Nocturnal jazz wanderer", "Festival-stage rock fan", "Ambient explorer") is immediately legible in a way that a merged preference blob is not.
Examples include a group of friends finding a restaurant, a team choosing an offsite location, or a household selecting an energy tariff. In each case, the unit nature of fraglets (small, focused, composable) makes combination tractable where a comprehensive profile would not be.
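Averaging is simple to express. The sketch below computes the element-wise mean of a set of embeddings and keeps the titles alongside for a readable summary; the two-dimensional vectors are toy values, and the function name is hypothetical.

```python
from typing import List, Sequence


def average_embeddings(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise mean of equally sized vectors."""
    if not vectors:
        raise ValueError("need at least one vector")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all vectors must have the same dimension")
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


# One fraglet per group member: title -> toy embedding
group = {
    "Nocturnal jazz wanderer": [1.0, 0.0],
    "Festival-stage rock fan": [0.0, 1.0],
}
blended = average_embeddings(list(group.values()))
summary = list(group)  # titles give a readable account of what was combined
```

One design point the mean leaves open: if the input vectors differ in length, longer ones pull the average towards themselves. Normalising each vector to unit length before averaging is a common way to give every member equal weight, though whether that is appropriate here is a choice for the implementing service.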
LLM re-ranking
Vector similarity provides efficient retrieval, but the top results can be further refined. An optional re-ranking step passes the fraglet's text fields and the candidate items' descriptions to an LLM, which produces an explained ranking. This adds interpretability: the user sees not just what matched but why.
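The re-ranking step depends mostly on what is placed in the prompt. As a rough sketch, the function below assembles a prompt from a fraglet's text fields and a shortlist of candidates. The prompt wording, the dictionary keys for candidates (id, description), and the function name are all assumptions; the call to an actual model is left out.

```python
from typing import Mapping, Sequence


def build_rerank_prompt(
    fraglet: Mapping[str, str], candidates: Sequence[Mapping[str, str]]
) -> str:
    """Assemble a re-ranking prompt from fraglet text and candidate descriptions."""
    lines = [
        "Profile:",
        f"Title: {fraglet['title']}",
        f"Brief: {fraglet['brief']}",
        f"Detail: {fraglet['detail']}",
        "",
        "Candidates:",
    ]
    for number, item in enumerate(candidates, start=1):
        lines.append(f"{number}. [{item['id']}] {item['description']}")
    lines.append("")
    lines.append(
        "Rank the candidates from best to worst fit for the profile "
        "and give a one-sentence reason for each."
    )
    return "\n".join(lines)


prompt = build_rerank_prompt(
    {
        "title": "Nocturnal jazz wanderer",
        "brief": "Slow, smoky small-group jazz.",
        "detail": "Prefers intimate trio recordings and unhurried ballads.",
    },
    [
        {"id": "gig-a", "description": "Piano trio, late set, 80-seat basement room."},
        {"id": "gig-b", "description": "Stadium rock double bill."},
    ],
)
```

Only the top results from the vector search would be passed in, which keeps the prompt small and the cost of the optional step bounded.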
Application domains
The fraglet pattern works across domains. Any context where structured characteristics need to be matched against a catalogue of options is a candidate. The schema provides a consistent structure that works wherever the mechanism applies.
Live music discovery (production)
The first production implementation powers MostMaker, a personalised live music discovery service for London. Users connect Spotify playlists or select artists; GPT-4o-mini produces a structured music fraglet; pgvector cosine similarity ranks hundreds of upcoming gigs against the fraglet embedding. The result is a personalised gig feed that reflects the user's actual taste rather than generic genre categories.
MostMaker demonstrates the full pipeline: signal acquisition, LLM-based generation, embedding, user refinement, and catalogue matching. It validates that the pattern works at production scale and produces recommendations users find genuinely useful.
Beyond taste
Music discovery is the most developed application, but the mechanism does not stop at taste. The dual representation — structured text plus vector embedding — works for any set of characteristics that can be expressed in natural language and compared by semantic similarity.
Candidate domains include property search (neighbourhood character, lifestyle fit), professional matching (working style, intellectual interests), travel planning (sensibility, pace, priorities), and energy services (consumption patterns, household characteristics). In each case, the same architecture applies: a structured profile, an embedding, and cosine similarity against a catalogue.
The association model becomes particularly interesting in these broader domains. A neighbourhood could publish a fraglet describing its character; residents associate with it. A company could publish fraglets describing its team cultures; candidates discover them. The fraglet becomes a shared description that multiple parties relate to, not a private record that one person carries.
The domain field is open-ended. New domains require no changes to the schema or protocol, only conventions for the additional field, which communities can define and publish independently.
Open questions
The fraglet architecture is production-tested but not fully resolved. Several questions remain open, and addressing them honestly is part of the work.
Embedding model dependency
Fraglet embeddings are only comparable when generated by the same model. A fraglet embedded with text-embedding-3-large cannot be directly compared to an item embedded with a different model. The current specification requires servers to declare their embedding model at the metadata endpoint, and consumers to verify compatibility. Whether the protocol should specify a canonical model or define an alignment mechanism between models remains an open design question.
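A consumer-side compatibility check might look like the sketch below. The shape of the metadata document is not defined in this paper, so the keys embedding_model and embedding_dimensions are hypothetical, as are the function name and the sample values.

```python
from typing import Any, Mapping


def embeddings_comparable(
    server_metadata: Mapping[str, Any], catalogue_model: str, catalogue_dim: int
) -> bool:
    """True only if server and catalogue use the same model and vector size."""
    return (
        server_metadata.get("embedding_model") == catalogue_model
        and server_metadata.get("embedding_dimensions") == catalogue_dim
    )


# Hypothetical metadata, as it might be parsed from the well-known endpoint
metadata = {"embedding_model": "text-embedding-3-large", "embedding_dimensions": 3072}

ok = embeddings_comparable(metadata, "text-embedding-3-large", 3072)
mismatch = embeddings_comparable(metadata, "some-other-model", 3072)
```

Checking the dimension as well as the model name guards against the case where the same model is configured to emit shortened vectors on one side but not the other.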
Cross-domain coherence
Fraglets are currently compared only within a single domain. Whether meaningful cross-domain comparisons are possible, or desirable, is unclear. A music fraglet and a food fraglet occupy the same embedding space but describe fundamentally different things, so the protocol currently treats domains as separate namespaces.
Schema evolution
The schema will need to evolve. Because fraglets are immutable, older fraglets remain valid even as the schema changes. But a versioning strategy is needed to ensure new servers can handle old fraglets and vice versa.
Abuse vectors
Could open fraglets be used for profiling or manipulation? The privacy-preserving design — no personal data, only characteristic expression — mitigates but does not eliminate this risk. A fraglet reveals nothing about who associates with it, but the characteristics it describes could be sensitive in certain contexts.
Accountability in the association model
If nobody owns a fraglet, who is responsible for harmful content? The current model assigns deletion rights to the creator, but open fraglets adopted by many users create a tension between individual authorship and collective use. Governance models for community-maintained fraglets are not yet defined.
Federation
The current architecture is centralised: fraglet.org is the canonical service. If the ecosystem grows to warrant multiple servers, the open schema and protocol make federation possible without redesigning the fraglet model itself. The questions of server trust, discovery, and cross-server identity would need to be addressed at that point, but the centralised starting point lets the pattern establish itself without premature architectural complexity.
Conclusion
Fraglets are infrastructure for a specific problem: matching structured profiles against catalogues of options, where the profile should be portable, shareable, and not locked to any single service.
The architecture rests on two choices. Dual representation: every fraglet is both human-readable and machine-computable, always in sync. People can inspect and correct them; machines can compute similarity against them directly. And association rather than ownership: fraglets are shared artefacts that anyone can relate to, adopt, and build upon. A commons for structured profiles, not a personal data vault.
Music taste was the first production application, but the mechanism generalises to any domain where characteristics can be expressed in natural language and compared by semantic similarity. The schema and protocol are open, and the fraglet.org API is accessible to any developer.
Fraglets complement existing personal context protocols. They fill a gap that conversational memory and preference governance do not address: making characteristics legible, portable, and matchable across services that share no platform, no model, and no memory layer.