[Vertex AI] Add documentation for Imagen symbols #14411

Merged
merged 8 commits into from
Feb 5, 2025
@@ -14,16 +14,45 @@

import Foundation

/// An aspect ratio for images generated by Imagen.
///
/// To specify an aspect ratio for generated images, set ``ImagenGenerationConfig/aspectRatio`` in
/// your ``ImagenGenerationConfig``. See the [Cloud
/// documentation](https://cloud.google.com/vertex-ai/generative-ai/docs/image/generate-images#aspect-ratio)
/// for more details and examples of the supported aspect ratios.
@available(iOS 15.0, macOS 12.0, macCatalyst 15.0, tvOS 15.0, watchOS 8.0, *)
public struct ImagenAspectRatio {
/// Square (1:1) aspect ratio.
///
/// Common uses for this aspect ratio include social media posts.
public static let square1x1 = ImagenAspectRatio(kind: .square1x1)

/// Portrait widescreen (9:16) aspect ratio.
///
/// This is the ``landscape16x9`` aspect ratio rotated 90 degrees. This is a relatively new
/// aspect ratio that has been popularized by short-form video apps (for example, YouTube
/// Shorts). Use this for tall objects with strong vertical orientations such as buildings,
/// trees, waterfalls, or other similar objects.
public static let portrait9x16 = ImagenAspectRatio(kind: .portrait9x16)

/// Widescreen (16:9) aspect ratio.
///
/// This ratio has replaced ``landscape4x3`` as the most common aspect ratio for TVs, monitors,
/// and mobile phone screens (landscape). Use this aspect ratio when you want to capture more of
/// the background (for example, scenic landscapes).
public static let landscape16x9 = ImagenAspectRatio(kind: .landscape16x9)

/// Portrait full screen (3:4) aspect ratio.
///
/// This is the ``landscape4x3`` aspect ratio rotated 90 degrees. This lets you capture more of
/// the scene vertically compared to the ``square1x1`` aspect ratio.
public static let portrait3x4 = ImagenAspectRatio(kind: .portrait3x4)

/// Fullscreen (4:3) aspect ratio.
///
/// This aspect ratio is commonly used in media or film. It also matches the proportions of most
/// old (non-widescreen) TVs and medium format cameras. It captures more of the scene horizontally
/// (compared to ``square1x1``), making it a preferred aspect ratio for photography.
public static let landscape4x3 = ImagenAspectRatio(kind: .landscape4x3)

let rawValue: String
@@ -14,9 +14,18 @@

import Foundation

/// An image generated by Imagen, stored in Cloud Storage (GCS) for Firebase.
@available(iOS 15.0, macOS 12.0, macCatalyst 15.0, tvOS 15.0, watchOS 8.0, *)
public struct ImagenGCSImage {
/// The IANA standard MIME type of the image file; either `"image/png"` or `"image/jpeg"`.
///
/// > Note: To request a different format, set ``ImagenGenerationConfig/imageFormat`` in
/// your ``ImagenGenerationConfig``.
public let mimeType: String

/// The URI of the file in Cloud Storage (GCS) for Firebase.
///
/// This is a `"gs://"`-prefixed URI, for example, `"gs://bucket-name/path/sample_0.jpg"`.
public let gcsURI: String

init(mimeType: String, gcsURI: String) {
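
Since `gcsURI` is documented as a `"gs://"`-prefixed string, callers may want to split it into a bucket and an object path. The following is a minimal sketch of that step; `parseGCSURI` is a hypothetical helper written for illustration and is not part of the SDK.

```swift
import Foundation

/// Hypothetical helper (not part of the SDK): splits a `gs://` URI into its bucket name and
/// object path. Returns nil if the prefix or the bucket name is missing.
func parseGCSURI(_ uri: String) -> (bucket: String, objectPath: String)? {
  let prefix = "gs://"
  guard uri.hasPrefix(prefix) else { return nil }
  let remainder = uri.dropFirst(prefix.count)
  // Everything before the first "/" is the bucket; the rest is the object path.
  let parts = remainder.split(separator: "/", maxSplits: 1, omittingEmptySubsequences: false)
  guard let bucket = parts.first, !bucket.isEmpty else { return nil }
  let objectPath = parts.count > 1 ? String(parts[1]) : ""
  return (String(bucket), objectPath)
}
```

For the example URI in the doc comment, this yields the bucket `bucket-name` and the object path `path/sample_0.jpg`.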
@@ -12,14 +12,60 @@
// See the License for the specific language governing permissions and
// limitations under the License.

/// Configuration options for generating images with Imagen.
@available(iOS 15.0, macOS 12.0, macCatalyst 15.0, tvOS 15.0, watchOS 8.0, *)
public struct ImagenGenerationConfig {
/// Specifies elements to exclude from the generated image.
///
/// Defaults to `nil`, which disables negative prompting. Use a comma-separated list to describe
/// unwanted elements or characteristics. See the [Cloud
/// documentation](https://cloud.google.com/vertex-ai/generative-ai/docs/image/generate-images#negative-prompt)
/// for more details.
///
/// > Important: Support for negative prompts depends on the Imagen model.
public var negativePrompt: String?

/// The number of image samples to generate; defaults to 1 if not specified.
///
/// > Important: The number of sample images that may be generated in each request depends on the
/// model (typically up to 4); see the
/// [`sampleCount`](https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/imagen-api#parameter_list)
/// documentation for more details.
public var numberOfImages: Int?

/// The aspect ratio of generated images.
///
/// Defaults to square, 1:1. Supported aspect ratios depend on the model; see
/// ``ImagenAspectRatio`` for more details.
public var aspectRatio: ImagenAspectRatio?

/// The image format of generated images.
///
/// Defaults to PNG. See ``ImagenImageFormat`` for more details.
public var imageFormat: ImagenImageFormat?

/// Whether to add an invisible watermark to generated images.
///
/// If `true`, an invisible SynthID watermark is embedded in generated images to indicate that
/// they are AI generated; `false` disables watermarking.
///
/// > Important: The default value depends on the model; see the
/// [`addWatermark`](https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/imagen-api#parameter_list)
/// documentation for model-specific details.
public var addWatermark: Bool?

/// Initializes configuration options for generating images with Imagen.
///
/// - Parameters:
/// - negativePrompt: Specifies elements to exclude from the generated image; disabled if not
/// specified. See ``negativePrompt``.
/// - numberOfImages: The number of image samples to generate; defaults to 1 if not specified.
/// See ``numberOfImages``.
/// - aspectRatio: The aspect ratio of generated images; defaults to square, 1:1. See
/// ``aspectRatio``.
/// - imageFormat: The image format of generated images; defaults to PNG. See ``imageFormat``.
/// - addWatermark: Whether to add an invisible watermark to generated images; the default value
/// depends on the model. See ``addWatermark``.
public init(negativePrompt: String? = nil, numberOfImages: Int? = nil,
aspectRatio: ImagenAspectRatio? = nil, imageFormat: ImagenImageFormat? = nil,
addWatermark: Bool? = nil) {
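
As an illustration of the initializer documented above, a configuration could be built roughly as follows. This is a sketch that assumes the `FirebaseVertexAI` module is available; the function name and all parameter values are illustrative choices, not recommendations from the PR.

```swift
import FirebaseVertexAI

// Hypothetical factory function; the values below are examples only.
@available(iOS 15.0, macOS 12.0, macCatalyst 15.0, tvOS 15.0, watchOS 8.0, *)
func makeLandscapeConfig() -> ImagenGenerationConfig {
  return ImagenGenerationConfig(
    negativePrompt: "blurry, low quality, text", // comma-separated unwanted elements
    numberOfImages: 2, // the upper limit depends on the model
    aspectRatio: .landscape16x9,
    imageFormat: .jpeg(compressionQuality: 80),
    addWatermark: true
  )
}
```

Every parameter is optional, so omitting one leaves the corresponding default described in the doc comments in effect.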
@@ -14,9 +14,29 @@

import Foundation

/// A response from a request to generate images with Imagen.
///
/// The type placeholder `T` is an image type of either ``ImagenInlineImage`` or ``ImagenGCSImage``.
///
/// This type is returned from:
/// - ``ImagenModel/generateImages(prompt:)`` where `T` is ``ImagenInlineImage``
/// - ``ImagenModel/generateImages(prompt:gcsURI:)`` where `T` is ``ImagenGCSImage``
@available(iOS 15.0, macOS 12.0, macCatalyst 15.0, tvOS 15.0, watchOS 8.0, *)
public struct ImagenGenerationResponse<T> {
/// The images generated by Imagen; see ``ImagenInlineImage`` and ``ImagenGCSImage``.
///
/// > Important: The number of images generated may be fewer than the number requested if one or
/// more were filtered out; see ``filteredReason``.
public let images: [T]

/// The reason, if any, that generated images were filtered out.
///
/// This property will only be populated if fewer images were generated than were requested;
/// otherwise it will be `nil`. Images may be filtered out due to the ``ImagenSafetyFilterLevel``,
/// the ``ImagenPersonFilterLevel``, or filtering included in the model. The filter levels may be
/// adjusted in your ``ImagenSafetySettings``. See the [Responsible AI and usage guidelines for
/// Imagen](https://cloud.google.com/vertex-ai/generative-ai/docs/image/responsible-ai-imagen)
/// for more details.
public let filteredReason: String?
}
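
The relationship between `images` and `filteredReason` described above can be sketched as follows. To keep the example runnable without the SDK, it uses `GenerationResult`, a local stand-in that mirrors only the two public properties of `ImagenGenerationResponse`; both the stand-in and `describeOutcome` are hypothetical names introduced here.

```swift
/// Local stand-in mirroring the two public properties of `ImagenGenerationResponse<T>`.
/// It is not the SDK type.
struct GenerationResult<T> {
  let images: [T]
  let filteredReason: String?
}

/// Hypothetical helper: describes the outcome of a request for `requested` images.
func describeOutcome<T>(_ result: GenerationResult<T>, requested: Int) -> String {
  let received = result.images.count
  if received >= requested {
    return "Received all \(requested) requested image(s)."
  }
  // Fewer images than requested: some were filtered out, so a reason may be present.
  let reason = result.filteredReason ?? "no reason provided"
  return "Received \(received) of \(requested) image(s); filtered: \(reason)"
}
```

With the real response type, the same check would compare `response.images.count` against the requested `numberOfImages` before reading `filteredReason`.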

@@ -14,15 +14,36 @@

import Foundation

/// An image format for images generated by Imagen.
///
/// To specify an image format for generated images, set ``ImagenGenerationConfig/imageFormat`` in
/// your ``ImagenGenerationConfig``. See the [Cloud
/// documentation](https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/imagen-api#output-options)
/// for more details.
@available(iOS 15.0, macOS 12.0, macCatalyst 15.0, tvOS 15.0, watchOS 8.0, *)
public struct ImagenImageFormat {
let mimeType: String
let compressionQuality: Int?

/// PNG image format.
///
/// Portable Network Graphic (PNG) is a lossless image format, meaning no image data is lost
/// during compression. Images in PNG format are *typically* larger than JPEG images, though this
/// depends on the image content and JPEG compression quality.
public static func png() -> ImagenImageFormat {
return ImagenImageFormat(mimeType: "image/png", compressionQuality: nil)
}

/// JPEG image format.
///
/// Joint Photographic Experts Group (JPEG) is a lossy compression format, meaning some image data
/// is discarded during compression. Images in JPEG format are *typically* smaller than PNG images,
/// though this depends on the image content and JPEG compression quality.
///
/// - Parameters:
/// - compressionQuality: The JPEG quality setting from 0 to 100, where `0` is the highest
/// level of compression (lowest image quality, smallest file size) and `100` is the lowest
/// level of compression (highest image quality, largest file size); defaults to `75`.
public static func jpeg(compressionQuality: Int? = nil) -> ImagenImageFormat {
return ImagenImageFormat(mimeType: "image/jpeg", compressionQuality: compressionQuality)
}
@@ -14,9 +14,16 @@

import Foundation

/// An image generated by Imagen, represented as inline data.
@available(iOS 15.0, macOS 12.0, macCatalyst 15.0, tvOS 15.0, watchOS 8.0, *)
public struct ImagenInlineImage {
/// The IANA standard MIME type of the image file; either `"image/png"` or `"image/jpeg"`.
///
/// > Note: To request a different format, set ``ImagenGenerationConfig/imageFormat`` in
/// your ``ImagenGenerationConfig``.
public let mimeType: String

/// The image data in PNG or JPEG format.
public let data: Data

init(mimeType: String, bytesBase64Encoded: String) {
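
The internal initializer above takes a `bytesBase64Encoded` string, which suggests the image payload is base64-encoded before it is exposed as `data`. The sketch below illustrates that decoding step and a way to pick a file extension from `mimeType` using Foundation only; both helper names are hypothetical and are not taken from the SDK.

```swift
import Foundation

/// Hypothetical helper: maps the two MIME types named in the docs to a file extension.
func fileExtension(forMIMEType mimeType: String) -> String? {
  switch mimeType {
  case "image/png": return "png"
  case "image/jpeg": return "jpg"
  default: return nil
  }
}

/// Hypothetical helper: decodes a base64 payload into raw bytes.
/// Returns nil if the input is not valid base64.
func decodeImageBytes(base64 encoded: String) -> Data? {
  return Data(base64Encoded: encoded)
}
```

Callers of the public API do not need the decoding step, since `ImagenInlineImage/data` is already a `Data` value; the extension lookup is the part that would apply when saving it to disk.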
49 changes: 47 additions & 2 deletions FirebaseVertexAI/Sources/Types/Public/Imagen/ImagenModel.swift
@@ -16,6 +16,15 @@ import FirebaseAppCheckInterop
import FirebaseAuthInterop
import Foundation

/// Represents a remote Imagen model with the ability to generate images using text prompts.
///
/// See the [Cloud
/// documentation](https://cloud.google.com/vertex-ai/generative-ai/docs/image/generate-images) for
/// more details about the image generation capabilities offered by the Imagen model.
///
/// > Warning: For Vertex AI in Firebase, image generation using Imagen 3 models is in Public
/// Preview, which means that the feature is not subject to any SLA or deprecation policy and
/// could change in backwards-incompatible ways.
@available(iOS 15.0, macOS 12.0, macCatalyst 15.0, tvOS 15.0, watchOS 8.0, *)
public final class ImagenModel {
/// The resource name of the model in the backend; has the format "models/model-name".
@@ -53,6 +62,20 @@ public final class ImagenModel {
self.requestOptions = requestOptions
}

/// **[Public Preview]** Generates images using the Imagen model and returns them as inline data.
///
/// The individual ``ImagenInlineImage/data`` is provided for each of the generated
/// ``ImagenGenerationResponse/images``.
///
/// > Note: By default, 1 image sample is generated; see ``ImagenGenerationConfig/numberOfImages``
/// to configure the number of images that are generated.
///
/// > Warning: For Vertex AI in Firebase, image generation using Imagen 3 models is in Public
/// Preview, which means that the feature is not subject to any SLA or deprecation policy and
/// could change in backwards-incompatible ways.
///
/// - Parameters:
/// - prompt: A text prompt describing the image(s) to generate.
public func generateImages(prompt: String) async throws
-> ImagenGenerationResponse<ImagenInlineImage> {
return try await generateImages(
@@ -65,12 +88,34 @@ public final class ImagenModel {
)
}

- public func generateImages(prompt: String, gcsUri: String) async throws
/// **[Public Preview]** Generates images using the Imagen model and stores them in Cloud Storage
/// (GCS) for Firebase.
///
/// The generated images are stored in a subdirectory of the requested `gcsURI`, named as a random
/// numeric hash. For example, for the `gcsURI` `"gs://bucket-name/path/"`, the generated images
/// are stored in `"gs://bucket-name/path/1234567890123/"` with the names `sample_0.png`,
/// `sample_1.png`, `sample_2.png`, ..., `sample_N.png`. In this example, `1234567890123` is the
/// hash value and `N + 1` is the number of images that were generated, up to the number
/// requested in ``ImagenGenerationConfig/numberOfImages``. The individual
/// ``ImagenGCSImage/gcsURI`` is provided for each of the generated
/// ``ImagenGenerationResponse/images``.
///
/// > Note: By default, 1 image sample is generated; see ``ImagenGenerationConfig/numberOfImages``
/// to configure the number of images that are generated.
///
/// > Warning: For Vertex AI in Firebase, image generation using Imagen 3 models is in Public
/// Preview, which means that the feature is not subject to any SLA or deprecation policy and
/// could change in backwards-incompatible ways.
///
/// - Parameters:
/// - prompt: A text prompt describing the image(s) to generate.
/// - gcsURI: The Cloud Storage (GCS) for Firebase URI where the generated images are stored.
/// This is a `"gs://"`-prefixed URI, for example, `"gs://bucket-name/path/"`.
public func generateImages(prompt: String, gcsURI: String) async throws
-> ImagenGenerationResponse<ImagenGCSImage> {
return try await generateImages(
prompt: prompt,
parameters: ImagenModel.imageGenerationParameters(
- storageURI: gcsUri,
+ storageURI: gcsURI,
generationConfig: generationConfig,
safetySettings: safetySettings
)
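
The storage layout described in the doc comment above (a hash-named subdirectory containing zero-indexed `sample_` files) can be illustrated with a small sketch. `expectedImageURIs` is a hypothetical helper; the hash is chosen by the backend and cannot be predicted, so real code should read ``ImagenGCSImage/gcsURI`` from each returned image rather than construct paths this way.

```swift
/// Hypothetical helper: lists object URIs that follow the naming pattern in the doc comment.
/// `hash` stands in for the backend-generated subdirectory name.
func expectedImageURIs(baseURI: String, hash: String, count: Int,
                       fileExtension: String = "png") -> [String] {
  // Ensure exactly one "/" separates the base URI from the hash subdirectory.
  let base = baseURI.hasSuffix("/") ? baseURI : baseURI + "/"
  // Samples are zero-indexed, so `count` images end at `sample_(count - 1)`.
  return (0 ..< max(count, 0)).map { "\(base)\(hash)/sample_\($0).\(fileExtension)" }
}
```

For two images under `"gs://bucket-name/path/"` with the hash `1234567890123`, this produces URIs ending in `sample_0.png` and `sample_1.png`.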
@@ -12,6 +12,11 @@
// See the License for the specific language governing permissions and
// limitations under the License.

/// A filter level controlling whether generation of images containing people or faces is allowed.
///
/// See the
/// [`personGeneration`](https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/imagen-api#parameter_list)
/// documentation for more details.
@available(iOS 15.0, macOS 12.0, macCatalyst 15.0, tvOS 15.0, watchOS 8.0, *)
public struct ImagenPersonFilterLevel: ProtoEnum {
enum Kind: String {
@@ -20,8 +25,23 @@ public struct ImagenPersonFilterLevel: ProtoEnum {
case allowAll = "allow_all"
}

/// Disallow generation of images containing people or faces; images of people are filtered out.
public static let blockAll = ImagenPersonFilterLevel(kind: .blockAll)

/// Allow generation of images containing adults only; images of children are filtered out.
///
/// > Important: Generation of images containing people or faces may require your use case to be
/// reviewed and approved by Cloud support; see the [Responsible AI and usage
/// guidelines](https://cloud.google.com/vertex-ai/generative-ai/docs/image/responsible-ai-imagen#person-face-gen)
/// for more details.
public static let allowAdult = ImagenPersonFilterLevel(kind: .allowAdult)

/// Allow generation of images containing people of all ages.
///
/// > Important: Generation of images containing people or faces may require your use case to be
/// reviewed and approved; see the [Responsible AI and usage
/// guidelines](https://cloud.google.com/vertex-ai/generative-ai/docs/image/responsible-ai-imagen#person-face-gen)
/// for more details.
public static let allowAll = ImagenPersonFilterLevel(kind: .allowAll)

let rawValue: String
@@ -12,6 +12,16 @@
// See the License for the specific language governing permissions and
// limitations under the License.

/// A filter level controlling how aggressively to filter sensitive content.
///
/// Text prompts provided as inputs and images (generated or uploaded) through Imagen on Vertex AI
/// are assessed against a list of safety filters, which include 'harmful categories' (for example,
/// `violence`, `sexual`, `derogatory`, and `toxic`). This filter level controls how aggressively to
/// filter out potentially harmful content from responses. See the
/// [`safetySetting`](https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/imagen-api#parameter_list)
/// documentation and the [Responsible AI and usage
/// guidelines](https://cloud.google.com/vertex-ai/generative-ai/docs/image/responsible-ai-imagen#safety-filters)
/// for more details.
@available(iOS 15.0, macOS 12.0, macCatalyst 15.0, tvOS 15.0, watchOS 8.0, *)
public struct ImagenSafetyFilterLevel: ProtoEnum {
enum Kind: String {
@@ -21,9 +31,21 @@ public struct ImagenSafetyFilterLevel: ProtoEnum {
case blockNone = "block_none"
}

/// The most aggressive filtering level; most strict blocking.
public static let blockLowAndAbove = ImagenSafetyFilterLevel(kind: .blockLowAndAbove)

/// Blocks some problematic prompts and responses.
public static let blockMediumAndAbove = ImagenSafetyFilterLevel(kind: .blockMediumAndAbove)

/// Reduces the number of requests blocked due to safety filters.
///
/// > Important: This may increase objectionable content generated by Imagen.
public static let blockOnlyHigh = ImagenSafetyFilterLevel(kind: .blockOnlyHigh)

/// The least aggressive filtering level; blocks very few problematic prompts and responses.
///
/// > Important: Access to this feature is restricted and may require your use case to be reviewed
/// and approved by Cloud support.
public static let blockNone = ImagenSafetyFilterLevel(kind: .blockNone)

let rawValue: String
@@ -14,11 +14,23 @@

import Foundation

/// Settings for controlling the aggressiveness of filtering out sensitive content.
///
/// See the [Responsible AI and usage
/// guidelines](https://cloud.google.com/vertex-ai/generative-ai/docs/image/responsible-ai-imagen#config-safety-filters)
/// for more details.
@available(iOS 15.0, macOS 12.0, macCatalyst 15.0, tvOS 15.0, watchOS 8.0, *)
public struct ImagenSafetySettings {
let safetyFilterLevel: ImagenSafetyFilterLevel?
let personFilterLevel: ImagenPersonFilterLevel?

/// Initializes safety settings for the Imagen model.
///
/// - Parameters:
/// - safetyFilterLevel: A filter level controlling how aggressively to filter out sensitive
/// content from generated images.
/// - personFilterLevel: A filter level controlling whether generation of images containing
/// people or faces is allowed.
public init(safetyFilterLevel: ImagenSafetyFilterLevel? = nil,
personFilterLevel: ImagenPersonFilterLevel? = nil) {
self.safetyFilterLevel = safetyFilterLevel
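
For illustration, the two filter levels documented in this PR could be combined into safety settings roughly as follows. This sketch assumes the `FirebaseVertexAI` module is available; the function name and the choice of the strictest levels are hypothetical examples, not defaults stated in the PR.

```swift
import FirebaseVertexAI

// Hypothetical factory function; picks the strictest documented level for each filter.
@available(iOS 15.0, macOS 12.0, macCatalyst 15.0, tvOS 15.0, watchOS 8.0, *)
func makeConservativeSafetySettings() -> ImagenSafetySettings {
  return ImagenSafetySettings(
    safetyFilterLevel: .blockLowAndAbove, // most aggressive content filtering
    personFilterLevel: .blockAll // no images containing people or faces
  )
}
```

Both parameters default to `nil`, so either one can be omitted to leave that filter unspecified.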