| 1 | += Vectors |
| 2 | +:description: Create and store vectors (embeddings) as properties on nodes and relationships, and use them for efficient semantic retrieval with vector indexes and the GenAI plugin. |
| 3 | +:page-role: new-neo4j-2025.10 |
| 4 | + |
| 5 | +`VECTOR` values can be stored as xref:indexes/semantic-indexes/vector-indexes.adoc#embeddings[embedding] properties on nodes and relationships, and used for efficient semantic retrieval using xref:indexes/semantic-indexes/vector-indexes.adoc[vector indexes] and the xref:genai-integrations.adoc[GenAI plugin]. |
| 6 | + |
| 7 | +[IMPORTANT] |
| 8 | +Although the `VECTOR` type is present in Cypher 5, constructing and comparing vectors is only possible via link:https://neo4j.com/docs/cypher-manual/25/values-and-types/vector/[Cypher 25 features]. |
| 9 | +However, vectors can still be inserted in and retrieved from the database with Cypher 5 queries using a link:https://neo4j.com/docs/create-applications/[Neo4j client library] version >= 6.0. |
| 10 | + |
| 11 | + |
| 12 | +[[vector-type]] |
| 13 | +== The vector type |
| 14 | + |
| 15 | +The `VECTOR` type is a fixed-length, ordered collection of numeric values (`INTEGER` or `FLOAT`) stored as a single unit. |
| 16 | +The type of a value is defined by: |
| 17 | + |
| 18 | +- *Dimension* -- The number of values it contains. |
| 19 | +- *Coordinate type* -- The data type of the entries, determining precision and storage size. |
| 20 | + |
| 21 | +.An example `VECTOR` value |
| 22 | +[source] |
| 23 | +---- |
| 24 | +vector([1.05, 0.123, 5], 3, FLOAT32) |
| 25 | +---- |
| 26 | + |
| 27 | +In this example, `[1.05, 0.123, 5]` is the list of values, `3` its dimension, and `FLOAT32` the data type of the individual entries. + |
| 28 | +Each number in the list can also be seen as a coordinate along one of the vector's dimensions. |
| 29 | + |
| 30 | + |
| 31 | +[[valid-values]] |
| 32 | +=== Valid values |
| 33 | + |
| 34 | +- A `VECTOR` value must have a dimension and a coordinate type (see table below for supported coordinate types). |
| 35 | +- The dimension of a `VECTOR` value must be larger than `0` and less than or equal to `4096`. |
| 36 | +- Vectors cannot contain lists as elements. |
| 37 | + |
| 38 | +.Supported coordinate types |
| 39 | +[options="header",cols="2*<m"] |
| 40 | +|=== |
| 41 | +| Default name | Alias |
| 42 | + |
| 43 | +| `FLOAT` | `FLOAT64` |
| 44 | +| `FLOAT32` | |
| 45 | +| `INTEGER` | `INT`, `INT64`, `INTEGER64`, `SIGNED INTEGER` |
| 46 | +| `INTEGER8` | `INT8` |
| 47 | +| `INTEGER16`| `INT16` |
| 48 | +| `INTEGER32` | `INT32` |
| 49 | + |
| 50 | +|=== |
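| | + |
| | +As a minimal sketch of the rules above, the following query constructs the same three-dimensional vector twice, once with a default coordinate type name and once with its alias from the table. It assumes a database that accepts Cypher 25; the `CYPHER 25` prefix is only needed where Cypher 25 is not the default language, and the column names are illustrative. |
| | + |
| | +.Constructing vectors with a default type name and with an alias (illustrative) |
| | +[source,cypher] |
| | +---- |
| | +CYPHER 25 |
| | +RETURN vector([1, 2, 3], 3, INTEGER8) AS byDefaultName, // default coordinate type name |
| | +       vector([1, 2, 3], 3, INT8) AS byAlias            // alias of INTEGER8 |
| | +---- |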
| 51 | + |
| 52 | + |
| 53 | +[[drivers-fallback]] |
| 54 | +== Vectors and client libraries (drivers) |
| 55 | + |
| 56 | +link:{neo4j-docs-base-uri}/create-applications/[Neo4j's client libraries] handle vectors differently depending on the library version. |
| 57 | + |
| 58 | +- *Versions >= 6.0* -- Vectors are fully supported and mapped into client types (see the _Data types_ page of each language manual). |
| 59 | +- *Versions < 6.0* -- Returning a `VECTOR` already present in the database results in a placeholder `MAP` value and a warning. |
| 60 | ++ |
| 61 | +.Result of returning a `VECTOR` with a driver older than 6.0 |
| 62 | +[source] |
| 63 | +---- |
| 64 | ++----------------------------------------------------------------+ |
| 65 | +| n.vector | |
| 66 | ++----------------------------------------------------------------+ |
| 67 | +| {originalType: "VECTOR(1, INTEGER64)", reason: "UNKNOWN_TYPE"} | |
| 68 | ++----------------------------------------------------------------+ |
| 69 | +warn: One or more values returned could not be handled by this version of the driver and were replaced with placeholder map values. Please upgrade your driver! |
| 70 | +03N95 (Neo.ClientNotification.UnknownType) |
| 71 | +---- |
| 72 | + |
| 73 | + |
| 74 | +[[type-coercion]] |
| 75 | +== Type coercion |
| 76 | + |
| 77 | +_Coercion_ is the conversion of list entries of one (implicit) type into the coordinate type of the vector being constructed, when the two types differ. |
| 78 | + |
| 79 | +When the coordinate type is the same as the type of the given elements, no coercion is done. |
| 80 | +When the coordinate type differs, coercion may be done or an error may be raised depending on the situation. |
| 81 | + |
| 82 | +*An error is raised* if a value does not fit into the coordinate type. |
| 83 | +If the coordinate type is an `INTEGER` type and all the coordinate values are `INTEGER` values, an error is raised if and only if one of the values does not fit into the size of the specified type. |
| 84 | +The same applies for `FLOAT` vector types: if the elements are all `FLOAT` values then an error will only be raised if one value does not fit into the specified type. |
| 85 | + |
| 86 | +*Coercion (i.e. lossy conversion) is allowed* when: |
| 87 | + |
| 88 | +- The list contains `INTEGER` values and the specified vector type is of a `FLOAT` type. |
| 89 | +Precision will be lost for values at the higher end of the range (see the link:https://docs.oracle.com/javase/specs/jls/se21/html/jls-5.html[Java type specification]), but an error will be raised only if the value were to overflow/underflow. + |
| 90 | +- The list contains `FLOAT` values and the specified type is of an `INTEGER` type. |
| 91 | +Information may be lost, as all values after the decimal point will be truncated, but an error will be raised only if the value were to overflow/underflow. + |
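| | + |
| | +The two coercion cases above can be sketched as follows. The literal values and column names are illustrative assumptions, and the comments restate the behavior described in this section rather than verified output. |
| | + |
| | +.Coercing list entries into a different coordinate type (illustrative) |
| | +[source,cypher] |
| | +---- |
| | +CYPHER 25 |
| | +RETURN vector([1, 2, 3], 3, FLOAT32) AS integersToFloat,       // INTEGER entries coerced to a FLOAT coordinate type |
| | +       vector([1.9, 2.5, 3.0], 3, INTEGER8) AS floatsToInteger // digits after the decimal point are truncated |
| | +---- |
| | + |
| | +By contrast, an entry outside the range of the coordinate type is expected to raise an error. Here `128` exceeds the range of an 8-bit signed integer (`-128` to `127`): |
| | + |
| | +.A value that does not fit into the coordinate type (illustrative) |
| | +[source,cypher] |
| | +---- |
| | +CYPHER 25 |
| | +RETURN vector([128], 1, INTEGER8) AS outOfRange // 128 does not fit into INTEGER8 |
| | +---- |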
| 92 | + |
| 93 | + |
| 94 | +[[supertypes]] |
| 95 | +== Supertypes |
| 96 | + |
| 97 | +`VECTOR` is a supertype of `VECTOR<TYPE>(DIMENSION)` types. |
| 98 | +The same applies for `VECTOR` types with only a coordinate type or a dimension: |
| 99 | + |
| 100 | +- `VECTOR` with only a defined dimension is a supertype of all `VECTOR` values of that dimension, regardless of the coordinate type. |
| 101 | +For example, `VECTOR(4)` is a supertype of `VECTOR<FLOAT>(4)` and `VECTOR<INT8>(4)`. |
| 102 | +- `VECTOR` with only a defined coordinate type is a supertype of all `VECTOR` values with that coordinate type, regardless of the dimension. |
| 103 | +For example, `VECTOR<INT>` is a supertype of `VECTOR<INT>(3)` and `VECTOR<INT>(1024)`. |
| 104 | + |
| 105 | +All of these supertypes can be used in xref:expressions/predicates/type-predicate-expressions.adoc#type-predicate-vector[type predicate expressions]. |
| 106 | +For more information, see: |
| 107 | + |
| 108 | +* xref:values-and-types/ordering-equality-comparison.adoc#ordering-and-comparison[Equality, ordering, and comparison of value types -> Ordering vector types] |
| 109 | +* xref:values-and-types/property-structural-constructed.adoc#vector-type-normalization[Property, structural, and constructed values -> Vector type normalization] |
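| | + |
| | +As a sketch of how the supertypes above might be used in type predicate expressions, the following query checks one vector against progressively more specific types. The type syntax follows the forms shown in this section; the variable and column names are illustrative. By the rules above, all four predicates should hold. |
| | + |
| | +.Checking a vector against its supertypes (illustrative) |
| | +[source,cypher] |
| | +---- |
| | +CYPHER 25 |
| | +WITH vector([1, 2, 3, 4], 4, INTEGER8) AS v |
| | +RETURN v IS :: VECTOR AS isVector,                    // any vector |
| | +       v IS :: VECTOR(4) AS hasDimension4,            // dimension only |
| | +       v IS :: VECTOR<INTEGER8> AS hasInt8Coordinates, // coordinate type only |
| | +       v IS :: VECTOR<INTEGER8>(4) AS isExactType     // coordinate type and dimension |
| | +---- |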
| 110 | + |
| 111 | + |
| 112 | +[[lists-embeddings-vector-indexes]] |
| 113 | +== Lists, vector embeddings, and vector indexes |
| 114 | + |
| 115 | +`VECTOR` and xref:values-and-types/lists.adoc[`LIST`] values are similar, and both can be indexed and searched using xref:indexes/semantic-indexes/vector-indexes.adoc[vector indexes], but they differ in a few key ways: |
| 116 | + |
| 117 | +- Elements in a `LIST` can be accessed individually, whereas operations on a `VECTOR` must operate on the entire `VECTOR`: it is not possible to access or slice individual elements. |
| 118 | +- Storing vector embeddings as `VECTOR` properties with a defined coordinate type allows them to be stored more efficiently. |
| 119 | +Moreover, using a smaller coordinate type (e.g., `INTEGER8` instead of `INTEGER16`) reduces storage requirements and improves performance, provided all values remain within the range supported by the smaller type. |
| 120 | + |
| 121 | +For information about how to store embeddings as `VECTOR` values with the xref:genai-integrations.adoc[GenAI plugin], see: |
| 122 | + |
| 123 | +* xref:genai-integrations.adoc#single-embedding[Generate a single embedding and store it] |
| 124 | +* xref:genai-integrations.adoc#multiple-embeddings[Generate multiple embeddings and store them] |