14 changes: 7 additions & 7 deletions colabs/finetuning.ipynb
@@ -106,7 +106,7 @@
{
"data": {
"text/plain": [
"[\u003c_Gemma2SpecialTokens.BOS: 2\u003e, 1596, 603, 671, 3287, 13060]"
"[<_Gemma2SpecialTokens.BOS: 2>, 1596, 603, 671, 3287, 13060]"
]
},
"execution_count": 22,
@@ -213,7 +213,7 @@
"id": "3ny2J07G2X7i"
},
"source": [
"We can decode an example from the batch to inspect the model input. We see that the `\u003cstart_of_turn\u003e` / `\u003cend_of_turn\u003e` where correctly added to follow Gemma dialog format."
"We can decode an example from the batch to inspect the model input. We see that the `<start_of_turn>` / `<end_of_turn>` where correctly added to follow Gemma dialog format."
]
},
{
@@ -238,9 +238,9 @@
"name": "stdout",
"output_type": "stream",
"text": [
"\u003cstart_of_turn\u003euser\n",
"Would love any other tips from anyone, but specially from someone who’s been where I’m at.\u003cend_of_turn\u003e\n",
"\u003cstart_of_turn\u003emodel\n",
"<start_of_turn>user\n",
"Would love any other tips from anyone, but specially from someone who’s been where I’m at.<end_of_turn>\n",
"<start_of_turn>model\n",
"J'apprécierais vraiment d'autres astuces, mais particulièrement par quelqu'un qui était était déjà là où je me trouve.\n"
]
}
@@ -398,7 +398,7 @@
"version_minor": 0
},
"text/plain": [
"train: 0%| | 0/301 [00:00\u003c?, ?it/s]"
"train: 0%| | 0/301 [00:00<?, ?it/s]"
]
},
"metadata": {},
@@ -641,7 +641,7 @@
"layout": "IPY_MODEL_e0685ee78f8c499390ac0e1024ab26a9",
"placeholder": "​",
"style": "IPY_MODEL_a022879a38e849cfac7178cea469f36f",
"value": " 301/301 [04:24\u0026lt;00:00, 6.86s/it]"
"value": " 301/301 [04:24&lt;00:00, 6.86s/it]"
}
},
"9cc2c80d152248c4bade0955c25554ad": {
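The decoded batch in the finetuning diff above illustrates the Gemma dialog format. As a minimal sketch of that round-trip, assuming the `gm.text.Gemma3Tokenizer` API that appears later in this PR (the import path and the `decode` call are assumptions based on these colabs):

```python
from gemma import gm  # import path assumed, as used throughout these colabs

tokenizer = gm.text.Gemma3Tokenizer()

# Encode a prompt in the Gemma dialog format seen in the decoded batch:
# each turn opens with <start_of_turn> plus the speaker and closes with
# <end_of_turn>. `add_bos=True` prepends the single <bos> token.
token_ids = tokenizer.encode(
    '<start_of_turn>user\n'
    'Would love any other tips from anyone.<end_of_turn>\n'
    '<start_of_turn>model\n',
    add_bos=True,
)

# Round-trip back to text to inspect exactly what the model will see.
print(tokenizer.decode(token_ids))
```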
18 changes: 9 additions & 9 deletions colabs/lora_finetuning.ipynb

Large diffs are not rendered by default.

48 changes: 24 additions & 24 deletions colabs/multimodal.ipynb

Large diffs are not rendered by default.

28 changes: 14 additions & 14 deletions colabs/quantization_aware_training.ipynb

Large diffs are not rendered by default.

16 changes: 8 additions & 8 deletions colabs/quantization_sampling.ipynb

Large diffs are not rendered by default.

38 changes: 19 additions & 19 deletions colabs/sampling.ipynb

Large diffs are not rendered by default.

20 changes: 10 additions & 10 deletions colabs/sharding.ipynb

Large diffs are not rendered by default.

56 changes: 28 additions & 28 deletions colabs/tokenizer.ipynb
@@ -402,9 +402,9 @@
},
"cell_type": "markdown",
"source": [
"### `\u003cbos\u003e` / `\u003ceos\u003e`\n",
"### `<bos>` / `<eos>`\n",
"\n",
"In Gemma models, the begin of sentence token (`\u003cbos\u003e`) should appear only once at the begining of the input. You can add it either explicitly or with `add_bos=True`:"
"In Gemma models, the begin of sentence token (`<bos>`) should appear only once at the begining of the input. You can add it either explicitly or with `add_bos=True`:"
]
},
{
@@ -432,7 +432,7 @@
{
"data": {
"text/plain": [
"[\u003c_Gemma2SpecialTokens.BOS: 2\u003e, 4521, 2134, 235341]"
"[<_Gemma2SpecialTokens.BOS: 2>, 4521, 2134, 235341]"
]
},
"execution_count": 55,
@@ -465,7 +465,7 @@
{
"data": {
"text/plain": [
"[\u003c_Gemma2SpecialTokens.BOS: 2\u003e, 4521, 2134, 235341]"
"[<_Gemma2SpecialTokens.BOS: 2>, 4521, 2134, 235341]"
]
},
"execution_count": 59,
@@ -481,9 +481,9 @@
},
"cell_type": "markdown",
"source": [
"Similarly, the model can output a `\u003ceos\u003e` token to indicate the prediction is complete.\n",
"Similarly, the model can output a `<eos>` token to indicate the prediction is complete.\n",
"\n",
"When fine-tuning Gemma, you can train the model to predict `\u003ceos\u003e` tokens."
"When fine-tuning Gemma, you can train the model to predict `<eos>` tokens."
]
},
{
@@ -509,7 +509,7 @@
{
"data": {
"text/plain": [
"[4521, 2134, 235341, \u003c_Gemma2SpecialTokens.EOS: 1\u003e]"
"[4521, 2134, 235341, <_Gemma2SpecialTokens.EOS: 1>]"
]
},
"execution_count": 60,
@@ -525,11 +525,11 @@
},
"cell_type": "markdown",
"source": [
"### `\u003cstart_of_turn\u003e` / `\u003cend_of_turn\u003e`\n",
"### `<start_of_turn>` / `<end_of_turn>`\n",
"\n",
"When using the instruction-tuned version of Gemma, the `\u003cstart_of_turn\u003e` / `\u003cend_of_turn\u003e` tokens allow to specify who from the user or the model is talking.\n",
"When using the instruction-tuned version of Gemma, the `<start_of_turn>` / `<end_of_turn>` tokens allow to specify who from the user or the model is talking.\n",
"\n",
"The `\u003cstart_of_turn\u003e` should be followed by either:\n",
"The `<start_of_turn>` should be followed by either:\n",
"\n",
"* `user`\n",
"* `model`\n",
@@ -543,14 +543,14 @@
},
"cell_type": "code",
"source": [
"token_ids = tokenizer.encode(\"\"\"\u003cstart_of_turn\u003euser\n",
"Knock knock.\u003cend_of_turn\u003e\n",
"\u003cstart_of_turn\u003emodel\n",
"Who's there ?\u003cend_of_turn\u003e\n",
"\u003cstart_of_turn\u003euser\n",
"Gemma.\u003cend_of_turn\u003e\n",
"\u003cstart_of_turn\u003emodel\n",
"Gemma who?\u003cend_of_turn\u003e\"\"\")"
"token_ids = tokenizer.encode(\"\"\"<start_of_turn>user\n",
"Knock knock.<end_of_turn>\n",
"<start_of_turn>model\n",
"Who's there ?<end_of_turn>\n",
"<start_of_turn>user\n",
"Gemma.<end_of_turn>\n",
"<start_of_turn>model\n",
"Gemma who?<end_of_turn>\"\"\")"
],
"outputs": [],
"execution_count": null
@@ -584,7 +584,7 @@
"type": "string"
},
"text/plain": [
"'\u003cstart_of_turn\u003e'"
"'<start_of_turn>'"
]
},
"execution_count": 77,
@@ -600,11 +600,11 @@
},
"cell_type": "markdown",
"source": [
"### `\u003cstart_of_image\u003e`\n",
"### `<start_of_image>`\n",
"\n",
"In Gemma 3, to indicate the position of an image in the text, the prompt should contain the special `\u003cstart_of_image\u003e` token. Internally, Gemma model will automatically expand the token to insert the soft images tokens.\n",
"In Gemma 3, to indicate the position of an image in the text, the prompt should contain the special `<start_of_image>` token. Internally, Gemma model will automatically expand the token to insert the soft images tokens.\n",
"\n",
"(Note: There's also a `\u003cend_of_image\u003e` token, but is handled internally by the model)"
"(Note: There's also a `<end_of_image>` token, but is handled internally by the model)"
]
},
{
@@ -624,7 +624,7 @@
"source": [
"In all Gemma versions, a few tokens (`99`) are unused. This allow custom applications to define and fine-tune their own custom tokens for their application. Those tokens are available through `tokenizer.special_tokens.CUSTOM + xx`, with `xx` being a number between `0` and `98`\n",
"\n",
"\u003c!-- TODO(epot): Add option to customize the special tokens --\u003e"
"<!-- TODO(epot): Add option to customize the special tokens -->"
]
},
{
@@ -656,7 +656,7 @@
"type": "string"
},
"text/plain": [
"'\u003cunused17\u003e'"
"'<unused17>'"
]
},
"execution_count": 28,
@@ -694,12 +694,12 @@
"source": [
"tokenizer = gm.text.Gemma3Tokenizer(\n",
" custom_tokens={\n",
" 0: '\u003cmy_custom_tag\u003e',\n",
" 17: '\u003cmy_other_tag\u003e',\n",
" 0: '<my_custom_tag>',\n",
" 17: '<my_other_tag>',\n",
" },\n",
")\n",
"\n",
"tokenizer.encode('\u003cmy_other_tag\u003e')"
"tokenizer.encode('<my_other_tag>')"
],
"outputs": [
{
@@ -786,7 +786,7 @@
"type": "string"
},
"text/plain": [
"'\u003cmy_other_tag\u003e'"
"'<my_other_tag>'"
]
},
"execution_count": 36,
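Taken together, the tokenizer cells in this diff cover `<bos>`/`<eos>`, dialog turns, `<start_of_image>`, and custom tokens. A combined sketch, assuming only the `gm.text.Gemma3Tokenizer` constructor and the `encode`/`decode` calls visible above (anything beyond them is an assumption):

```python
from gemma import gm  # import path assumed, as used throughout these colabs

# Map some of the 99 unused token slots (0-98) to application-specific tokens.
tokenizer = gm.text.Gemma3Tokenizer(
    custom_tokens={
        0: '<my_custom_tag>',
        17: '<my_other_tag>',
    },
)

# <bos> should appear exactly once, at the beginning of the input.
ids = tokenizer.encode('Hello world!', add_bos=True)

# Instruction-tuned checkpoints expect <start_of_turn>/<end_of_turn>, with
# each opening tag followed by the speaker: `user` or `model`.
# In Gemma 3, an image's position is marked with <start_of_image>, which the
# model expands internally into soft image tokens.
dialog_ids = tokenizer.encode(
    '<start_of_turn>user\n'
    'Knock knock.<end_of_turn>\n'
    '<start_of_turn>model\n'
)

# Custom tokens round-trip like any other token.
assert tokenizer.decode(tokenizer.encode('<my_other_tag>')) == '<my_other_tag>'
```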
56 changes: 28 additions & 28 deletions colabs/tool_use.ipynb
@@ -118,7 +118,7 @@
"output_type": "stream",
"text": [
"INFO:2025-06-06 02:43:16,896:jax._src.xla_bridge:749: Unable to initialize backend 'pathways': Could not initialize backend 'pathways'\n",
"INFO:2025-06-06 02:43:16,897:jax._src.xla_bridge:749: Unable to initialize backend 'proxy': INVALID_ARGUMENT: IFRT proxy server address must be '\u003ctransport-type\u003e://\u003cbackend-address\u003e' (e.g., 'grpc://localhost'), but got \n",
"INFO:2025-06-06 02:43:16,897:jax._src.xla_bridge:749: Unable to initialize backend 'proxy': INVALID_ARGUMENT: IFRT proxy server address must be '<transport-type>://<backend-address>' (e.g., 'grpc://localhost'), but got \n",
"INFO:2025-06-06 02:43:16,900:jax._src.xla_bridge:749: Unable to initialize backend 'mlcr': Could not initialize backend 'mlcr'\n",
"INFO:2025-06-06 02:43:16,901:jax._src.xla_bridge:749: Unable to initialize backend 'sliceme': Could not initialize backend 'sliceme'\n"
]
@@ -194,10 +194,10 @@
{
"data": {
"text/html": [
"\u003chr\u003e"
"<hr>"
],
"text/plain": [
"\u003cIPython.core.display.HTML object\u003e"
"<IPython.core.display.HTML object>"
]
},
"metadata": {},
@@ -213,10 +213,10 @@
{
"data": {
"text/html": [
"\u003chr\u003e"
"<hr>"
],
"text/plain": [
"\u003cIPython.core.display.HTML object\u003e"
"<IPython.core.display.HTML object>"
]
},
"metadata": {},
@@ -233,10 +233,10 @@
{
"data": {
"text/html": [
"\u003chr\u003e"
"<hr>"
],
"text/plain": [
"\u003cIPython.core.display.HTML object\u003e"
"<IPython.core.display.HTML object>"
]
},
"metadata": {},
@@ -252,10 +252,10 @@
{
"data": {
"text/html": [
"\u003chr\u003e"
"<hr>"
],
"text/plain": [
"\u003cIPython.core.display.HTML object\u003e"
"<IPython.core.display.HTML object>"
]
},
"metadata": {},
@@ -272,10 +272,10 @@
{
"data": {
"text/html": [
"\u003chr\u003e"
"<hr>"
],
"text/plain": [
"\u003cIPython.core.display.HTML object\u003e"
"<IPython.core.display.HTML object>"
]
},
"metadata": {},
@@ -291,10 +291,10 @@
{
"data": {
"text/html": [
"\u003chr\u003e"
"<hr>"
],
"text/plain": [
"\u003cIPython.core.display.HTML object\u003e"
"<IPython.core.display.HTML object>"
]
},
"metadata": {},
@@ -311,10 +311,10 @@
{
"data": {
"text/html": [
"\u003chr\u003e"
"<hr>"
],
"text/plain": [
"\u003cIPython.core.display.HTML object\u003e"
"<IPython.core.display.HTML object>"
]
},
"metadata": {},
@@ -330,10 +330,10 @@
{
"data": {
"text/html": [
"\u003chr\u003e"
"<hr>"
],
"text/plain": [
"\u003cIPython.core.display.HTML object\u003e"
"<IPython.core.display.HTML object>"
]
},
"metadata": {},
@@ -350,10 +350,10 @@
{
"data": {
"text/html": [
"\u003chr\u003e"
"<hr>"
],
"text/plain": [
"\u003cIPython.core.display.HTML object\u003e"
"<IPython.core.display.HTML object>"
]
},
"metadata": {},
@@ -369,10 +369,10 @@
{
"data": {
"text/html": [
"\u003chr\u003e"
"<hr>"
],
"text/plain": [
"\u003cIPython.core.display.HTML object\u003e"
"<IPython.core.display.HTML object>"
]
},
"metadata": {},
@@ -407,7 +407,7 @@
"\n",
"To create your own tool, you can inherit from the `gm.tools.Tool` class. You should provide:\n",
"\n",
"* A description \u0026 example, so the model knows how to use your tool\n",
"* A description & example, so the model knows how to use your tool\n",
"* Implement the `call` method. The `call` function can take arbitrary `**kwargs`, but the name of the args should match the ones defined in `tool_kwargs` and `tool_kwargs_doc`"
]
},
@@ -435,12 +435,12 @@
" query='Which day of the week are we today ?',\n",
" thought='The `datetime.strptime` uses %a for day of the week',\n",
" tool_kwargs={'format': '%a'},\n",
" tool_kwargs_doc={'format': '\u003cANY datetime.strptime expression\u003e'},\n",
" tool_kwargs_doc={'format': '<ANY datetime.strptime expression>'},\n",
" result='Sat',\n",
" answer='Today is Saturday.',\n",
" )\n",
"\n",
" def call(self, format: str) -\u003e str:\n",
" def call(self, format: str) -> str:\n",
" dt = datetime.datetime.now()\n",
" return dt.strftime(format)\n"
],
@@ -499,10 +499,10 @@
{
"data": {
"text/html": [
"\u003chr\u003e"
"<hr>"
],
"text/plain": [
"\u003cIPython.core.display.HTML object\u003e"
"<IPython.core.display.HTML object>"
]
},
"metadata": {},
@@ -518,10 +518,10 @@
{
"data": {
"text/html": [
"\u003chr\u003e"
"<hr>"
],
"text/plain": [
"\u003cIPython.core.display.HTML object\u003e"
"<IPython.core.display.HTML object>"
]
},
"metadata": {},
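The tool_use diff above shows the pattern for a custom tool: inherit from `gm.tools.Tool`, give the model a description and a worked example, and implement `call` with argument names matching `tool_kwargs` / `tool_kwargs_doc`. Below is a sketch of another tool in the same shape; the `DESCRIPTION`/`EXAMPLE` attribute names and the `gm.tools.Example` wrapper are assumptions inferred from the fields visible in the diff:

```python
import random

from gemma import gm  # import path assumed, as used throughout these colabs


class DiceTool(gm.tools.Tool):
    """Hypothetical tool that rolls an N-sided die."""

    # The attribute names below are assumptions; the example fields
    # (query, thought, tool_kwargs, tool_kwargs_doc, result, answer)
    # mirror the date tool shown in the diff above.
    DESCRIPTION = 'Roll a die with a configurable number of sides.'
    EXAMPLE = gm.tools.Example(
        query='Roll a 6-sided die.',
        thought='The tool takes the number of sides as `sides`.',
        tool_kwargs={'sides': 6},
        tool_kwargs_doc={'sides': '<ANY positive integer>'},
        result='4',
        answer='You rolled a 4.',
    )

    def call(self, sides: int) -> str:
        # The argument name must match the keys in `tool_kwargs` /
        # `tool_kwargs_doc` so the model can fill it in correctly.
        return str(random.randint(1, int(sides)))
```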