Generative AI has advanced at a remarkable pace over the past few years and is being adopted across nearly every industry. Much of its value comes from the fact that a single model, pretrained on vast amounts of data, can be customized and reused for many different downstream tasks.
At the center of this progress are large language models (LLMs). LLMs are deep learning models trained on enormous text corpora, and they can perform a wide variety of language tasks with little or no task-specific training data. Representative use cases include translation, summarization, question answering, and text generation.
Training a state-of-the-art LLM from scratch requires datasets and compute budgets that few organizations can afford, so a far more practical way to build AI applications is to take an existing pretrained model and adapt it to a specific downstream task. This also substantially reduces the cost of developing, deploying, and maintaining AI applications.
Approaches to customizing LLMs
The simplest approach requires no training at all: the LLM is applied to a new task as-is, guided only by zero-shot or few-shot examples placed in the prompt. In few-shot prompting, a handful of input-output examples are included in the prompt to show the model what is expected. However, these prompting techniques do not always match the accuracy of trained approaches, and the results are sensitive to how the prompt is written.
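To make the idea concrete, here is a minimal, hypothetical few-shot prompt (the task and examples here are illustrative, not from the tutorial); the frozen model is steered entirely by the examples embedded in the prompt text:
few_shot_prompt = """Classify the sentiment of each review as positive or negative.

Review: The battery lasts all day. Sentiment: positive
Review: The screen cracked within a week. Sentiment: negative
Review: Setup was quick and painless. Sentiment:"""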
At the other extreme is fine-tuning a model such as GPT, which updates all of the pretrained model's parameters on task-specific data and can reach high accuracy on the target task. However, fine-tuning produces a separate full copy of the model for every task, which makes training and deployment considerably more expensive.
Prompt learning is a newer technique that combines the strengths of both approaches. The weights of the original model are left frozen, and only a small set of additional parameters is trained. As a result, a model can be adapted to new tasks with far fewer resources, while the generality of the base LLM is preserved.
In this post, we show how to use prompt learning techniques, prompt tuning and p-tuning, in the enterprise-ready, open-source NVIDIA NeMo framework to efficiently customize an LLM.
Prompt learning with NVIDIA NeMo
NeMo prompt learning supports two techniques that teach the model new tasks by training virtual token embeddings while leaving the base model's weights untouched. These are prompt tuning and the p-tuning method used in this post.
- In prompt tuning, the virtual prompt embeddings are initialized as a 2D matrix. Each task has its own unique 2D embedding matrix, and tasks do not share any parameters during training or inference. All base LLM parameters remain frozen, and only the embedding matrix for each task is updated during training. NeMo's prompt tuning implementation is based on the paper The Power of Scale for Parameter-Efficient Prompt Tuning.
- In p-tuning, an LSTM model or MLP, called the "prompt encoder", is used to predict the virtual token embeddings. The prompt encoder parameters are randomly initialized when p-tuning starts. All base LLM parameters are frozen, and only the prompt encoder weights are updated at each training step. The prompt encoder parameters are shared across all tasks p-tuned at the same time, but the encoder predicts a unique set of virtual token embeddings for each task. NeMo's p-tuning implementation is based on the paper GPT Understands, Too. (A simplified sketch of both techniques follows this list.)
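The following PyTorch sketch contrasts the two ideas; it is written for illustration only, and the shapes and layer choices are simplified assumptions rather than NeMo's actual internals:
import torch
import torch.nn as nn

# Prompt tuning: a directly trained 2D embedding matrix per task.
class PromptTable(nn.Module):
    def __init__(self, total_virtual_tokens: int, hidden_size: int):
        super().__init__()
        # The only trainable parameters; the base LLM stays frozen.
        self.prompt_embeddings = nn.Parameter(
            torch.randn(total_virtual_tokens, hidden_size) * 0.02
        )

    def forward(self) -> torch.Tensor:
        return self.prompt_embeddings  # [tokens, hidden]

# P-tuning: an LSTM "prompt encoder" predicts the virtual token embeddings.
class PromptEncoder(nn.Module):
    def __init__(self, total_virtual_tokens: int, hidden_size: int):
        super().__init__()
        self.input_embeds = nn.Parameter(
            torch.randn(total_virtual_tokens, hidden_size) * 0.02
        )
        self.lstm = nn.LSTM(hidden_size, hidden_size // 2, num_layers=2,
                            bidirectional=True, batch_first=True)
        self.mlp = nn.Sequential(nn.Linear(hidden_size, hidden_size),
                                 nn.ReLU(),
                                 nn.Linear(hidden_size, hidden_size))

    def forward(self) -> torch.Tensor:
        # Only these encoder weights are updated during p-tuning.
        out, _ = self.lstm(self.input_embeds.unsqueeze(0))
        return self.mlp(out).squeeze(0)  # [tokens, hidden]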
The code used in this post is based on the open-source version of NeMo available through the NeMo OSS repo.
A complete prompt learning example using the small GPT-3 345M model is provided in the NeMo multitask prompt and p-tuning tutorial notebook on GitHub. The tutorial walks through the entire workflow, including data download and preprocessing, model configuration, and prompt learning and inference across multiple downstream tasks.
The sections below follow the main steps of that tutorial. They then show how to adapt the NeMo code when moving to larger models and multi-GPU training.
Prerequisites
You can experience NeMo through the NeMo Docker container, which provides a self-contained environment with all the dependencies needed to experiment with NeMo. The NeMo multitask prompt and p-tuning tutorial was tested against the NeMo 23.02 container, but you can also try the same steps with later releases. Run and enter the container with the following script:
docker run -u $(id -u ${USER}):$(id -g ${USER}) --rm -it --net=host nvcr.io/nvidia/nemo:23.02 bash
Then, launch Jupyter Lab from the bash shell inside the container:
cd /workspace
jupyter lab --ip 0.0.0.0 --allow-root --port=8888
In Jupyter Lab, you can find a copy of this tutorial, along with the NeMo code, at /workspace/nemo/tutorials/nlp/Multitask_Prompt_and_PTuning.ipynb.
The 5B and 1.3B GPT-3 models used in the tutorial require a single GPU, while running the 20B model with 4-way tensor parallelism (TP) requires four NVIDIA Ampere architecture or NVIDIA Hopper architecture GPUs.
Data preparation
The tutorial shows how to download and preprocess the data for the SQuAD question answering task.
The dataset must be a .jsonl file containing one JSON object per line. Each JSON object must include a taskname field, which identifies the task the record belongs to, along with the text fields referenced by that task's prompt template.
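For example, a single SQuAD training record in this format might look like the following (the content here is illustrative; the actual records are produced by the tutorial's preprocessing script):
{"taskname": "squad", "context": "Paris has been the capital of France since the 10th century.", "question": "What is the capital of France?", "answer": "Paris"}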

Prompt template
When defining a task, you must also specify a prompt template. The template determines which fields from the data records are used and how they are arranged. The template used for SQuAD looks like this:
{
    "taskname": "squad",
    "prompt_template": "<|VIRTUAL_PROMPT_0|> Context: {context}\n\nQuestion: {question}\n\nAnswer:{answer}",
    "total_virtual_tokens": 10,
    "virtual_token_splits": [10],
    "truncate_field": "context",
    "answer_only_loss": True,
    "answer_field": "answer",
}
The template specifies that 10 virtual tokens are placed at the start of the prompt, followed by the context, the question, and the answer. The corresponding fields of each JSON data record are inserted into the template to form the full training prompt. NeMo also truncates the field designated by truncate_field so that the total sequence length does not exceed the model's limit (2,048 tokens for NeMo models, which use the HuggingFace GPT-2 tokenizer).
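To see how a record and the template combine, the following illustrative Python snippet mimics the substitution NeMo performs internally (NeMo additionally inserts the 10 trained virtual token embeddings at the <|VIRTUAL_PROMPT_0|> position, which plain string formatting cannot show):
record = {
    "taskname": "squad",
    "context": "Paris has been the capital of France since the 10th century.",
    "question": "What is the capital of France?",
    "answer": " Paris",
}
template = ("<|VIRTUAL_PROMPT_0|> Context: {context}\n\n"
            "Question: {question}\n\nAnswer:{answer}")
# str.format ignores the extra "taskname" key and fills the placeholders
print(template.format(**record))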
Training
NeMo ships with example config yaml files, which you can find in NVIDIA/NeMo on GitHub. The yaml config in the tutorial is set up by default for prompt learning with the 345M GPT model. NeMo p-tuning lets you learn multiple tasks at the same time from a single config. Because NeMo builds on the PyTorch Lightning framework, training the virtual token embeddings then comes down to a single trainer.fit(model) call.
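Condensed from the tutorial notebook, the training setup looks roughly like the following; this is a sketch of the NeMo 1.x-era API, so treat the exact import paths and strategy arguments as assumptions and refer to the notebook for the authoritative version:
import pytorch_lightning as pl
from omegaconf import OmegaConf
from nemo.collections.nlp.models.language_modeling.megatron_gpt_prompt_learning_model import MegatronGPTPromptLearningModel
from nemo.collections.nlp.parts.nlp_overrides import NLPDDPStrategy

# Load the prompt learning config described later in this post
cfg = OmegaConf.load("megatron_gpt_prompt_learning_squad.yaml")

# Megatron-aware DDP strategy used by the tutorial
strategy = NLPDDPStrategy(find_unused_parameters=False, no_ddp_communication_hook=True)
trainer = pl.Trainer(strategy=strategy, **cfg.trainer)

model = MegatronGPTPromptLearningModel(cfg=cfg.model, trainer=trainer)
trainer.fit(model)  # the entire training loop is this one call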
Inference
Finally, once the model is trained, it can predict answers for a set of test examples (with the "answer_field" omitted) through a single model.generate(inputs=test_examples) call.
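For example, inference on a held-out record can be sketched as follows, reusing the trained model object from above (the example content is illustrative, and the response format follows the tutorial's usage; verify it against your NeMo version):
# Test examples omit the field named by answer_field; the model completes it
test_examples = [
    {
        "taskname": "squad",
        "context": "The Amazon rainforest covers much of the Amazon basin of South America.",
        "question": "Which continent is the Amazon rainforest on?",
    },
]
response = model.generate(inputs=test_examples, length_params=None)
print(response["sentences"][0])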
Prompt learning with larger models
Instead of the 345M GPT-3 model used in the tutorial notebook, you can also use the larger NeMo GPT-3 models, such as 1.3B GPT-3 and 5B GPT-3. These models require GPUs with more memory, such as NVIDIA V100, NVIDIA A100, and NVIDIA H100. The notebook cell that downloads the model is shown below:
# Download the model from NGC
gpt_file_name = "megatron_gpt_345m.nemo"
!wget --content-disposition https://api.ngc.nvidia.com/v2/models/nvidia/nemo/megatron_gpt_345m/versions/1/files/megatron_gpt_345m.nemo
Instead of downloading the 345M GPT model from NGC, download the 1.3B GPT-3 or 5B GPT-3 model from HuggingFace, then point the gpt_file_name variable to the .nemo model file.
For the 5B model, there are two variants: one with TP=1 (nemo_gpt5B_fp16_tp1.nemo) and others with TP=2 (nemo_gpt5B_fp16_tp2.nemo, nemo_gpt5B_bf16_tp2.nemo). The notebook can only support the TP=1 variant. Once it is downloaded, the remaining notebook cells can be run as before, with everything else unchanged.
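For instance, a notebook cell fetching the TP=1 variant of the 5B model could look like the following; the URL pattern is an assumption based on the nvidia/nemo-megatron-gpt-5B model card on HuggingFace, so verify it before use:
# Download the TP=1 variant of the 5B model from HuggingFace (URL assumed; verify first)
gpt_file_name = "nemo_gpt5B_fp16_tp1.nemo"
!wget --content-disposition https://huggingface.co/nvidia/nemo-megatron-gpt-5B/resolve/main/nemo_gpt5B_fp16_tp1.nemo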
Prompt learning on multiple GPUs
To move beyond the Jupyter notebook environment, you can scale prompt learning to multiple GPUs. For models that require tensor parallelism (for example, TP=4 for the 20B GPT-3 model, or TP=2 for some variants of the 5B GPT-3 model), training must be launched across multiple GPUs with the NeMo prompt learning script rather than from the notebook. This section explains the script used in that workflow and the accompanying config changes.
Setup
This section shows how to download one of the larger models and run prompt learning with model parallelism across multiple GPUs, instead of the smaller model used in the tutorial notebook.
You can download either the 5B GPT model with TP=2 (nemo_gpt5B_fp16_tp2.nemo) or the 20B GPT-3 model with TP=4. Both models are packaged as .nemo zip archives. To speed up model loading significantly, extract the archive beforehand and use the extracted NeMo folder from then on. Run the following script:
mkdir nemo_gpt5B_fp16_tp2.nemo.extracted
tar -xvf nemo_gpt5B_fp16_tp2.nemo -C nemo_gpt5B_fp16_tp2.nemo.extracted
Then reference this extracted folder, nemo_gpt5B_fp16_tp2.nemo.extracted, in the NeMo config.
Config
Next, observe the config used for prompt learning (by both the notebook and the training script), which has the following structure:
name: megatron_virtual_prompt_gpt

trainer:
  devices: 2
  accelerator: gpu
  num_nodes: 1
  precision: 16
  logger: False # logger provided by exp_manager
  enable_checkpointing: False
  replace_sampler_ddp: False
  max_epochs: 25 # min 25 recommended
  max_steps: -1 # consumed_samples = global_step * micro_batch_size * data_parallel_size * accumulate_grad_batches
  log_every_n_steps: 10 # frequency with which training steps are logged
  val_check_interval: 1.0 # If is an int n > 1, will run val every n training steps, if a float 0.0 - 1.0 will run val every epoch fraction, e.g. 0.25 will run val every quarter epoch
  gradient_clip_val: 1.0
  resume_from_checkpoint: null # The path to a checkpoint file to continue the training, restores the whole state including the epoch, step, LR schedulers, apex, etc.
  benchmark: False

exp_manager:
  explicit_log_dir: null
  exp_dir: null
  name: ${name}
  create_wandb_logger: False
  wandb_logger_kwargs:
    project: null
    name: null
  resume_if_exists: True
  resume_ignore_no_checkpoint: True
  create_checkpoint_callback: True
  checkpoint_callback_params:
    monitor: val_loss
    save_top_k: 2
    mode: min
    save_nemo_on_train_end: False # Should be false, correct prompt learning model file is saved at model.nemo_path set below
    filename: 'megatron_gpt_prompt_tune--{val_loss:.3f}-{step}'
    model_parallel_size: ${model.tensor_model_parallel_size}
    save_best_model: True

model:
  seed: 1234
  nemo_path: ${name}.nemo # .nemo filename/absolute path to where the virtual prompt model parameters will be saved
  virtual_prompt_style: 'p-tuning' # one of 'prompt-tuning', 'p-tuning', or 'inference'
  tensor_model_parallel_size: 1 # intra-layer model parallelism
  pipeline_model_parallel_size: 1 # inter-layer model parallelism
  global_batch_size: 8
  micro_batch_size: 4
  restore_path: null # Path to an existing p-tuned/prompt tuned .nemo model you wish to add new tasks to or run inference with
  language_model_path: ??? # Path to the GPT language model .nemo file, always required
  save_nemo_on_validation_end: True # Saves an inference ready .nemo file every time a checkpoint is saved during training.
  existing_tasks: [] # List of tasks the model has already been p-tuned/prompt-tuned for, needed when a restore path is given
  new_tasks: ['squad'] # List of new tasknames to be prompt-tuned

  ## Sequence Parallelism
  # Makes tensor parallelism more memory efficient for LLMs (20B+) by parallelizing layer norms and dropout sequentially
  # See Reducing Activation Recomputation in Large Transformer Models: https://arxiv.org/abs/2205.05198 for more details.
  sequence_parallel: False

  ## Activation Checkpoint
  activations_checkpoint_granularity: null # 'selective' or 'full'
  activations_checkpoint_method: null # 'uniform', 'block', not used with 'selective'
  # 'uniform' divides the total number of transformer layers and checkpoints the input activation
  # of each chunk at the specified granularity
  # 'block' checkpoints the specified number of layers per pipeline stage at the specified granularity
  activations_checkpoint_num_layers: null # not used with 'selective'

  task_templates: # Add more/replace tasks as needed, these are just examples
    - taskname: "squad"
      prompt_template: "<|VIRTUAL_PROMPT_0|> Context: {context}\n\nQuestion: {question}\n\nAnswer:{answer}"
      total_virtual_tokens: 10
      virtual_token_splits: [10]
      truncate_field: null
      answer_only_loss: False
      answer_field: "answer"

  prompt_tuning: # Prompt tuning specific params
    new_prompt_init_methods: ['text'] # List of 'text' or 'random', should correspond to tasks listed in new tasks
    new_prompt_init_text: ['some init text goes here'] # some init text if init method is text, or None if init method is random

  p_tuning: # P-tuning specific params
    encoder_type: "tpmlp" # ['tpmlp', 'lstm', 'biglstm', 'mlp']
    dropout: 0.0
    num_layers: 2 # number of layers for MLP or LSTM layers. Note, it has no effect for tpmlp currently as it always assumes it is two layers.
    encoder_hidden: 2048 # encoder hidden for biglstm and tpmlp
    init_std: 0.023 # init std for tpmlp layers

  data:
    train_ds: ???
    validation_ds: ???
    add_eos: True
    shuffle: True
    num_workers: 8
    pin_memory: True
    train_cache_data_path: null # the path to the train cache data
    validation_cache_data_path: null # the path to the validation cache data
    test_cache_data_path: null # the path to the test cache data
    load_cache: False # whether to load from the cache data

  optim:
    name: fused_adam
    lr: 1e-4
    weight_decay: 0.01
    betas:
      - 0.9
      - 0.98
    sched:
      name: CosineAnnealing
      warmup_steps: 50
      min_lr: 0.0 # min_lr must be 0.0 for prompt learning when pipeline parallel > 1
      constant_steps: 0 # Constant steps should also be 0 when min_lr=0
      monitor: val_loss
      reduce_on_plateau: false
Using the Jupyter Lab interface, create a file with this content and save it at /workspace/nemo/examples/nlp/language_modeling/conf/megatron_gpt_prompt_learning_squad.yaml.
The most important part of this config file is the task template, shown below:
prompt_template: "<|VIRTUAL_PROMPT_0|> Context: {context}\n\nQuestion: {question}\n\nAnswer:{answer}"
total_virtual_tokens: 10
virtual_token_splits: [10]
truncate_field: null
answer_only_loss: False
"answer_field": "answer"
Here, 10 virtual tokens are used, together with the context and the question, to predict the answer.
Training
To start training, first open a terminal from the Jupyter Lab interface (File → New → Terminal). Then run the following bash command:
python /workspace/nemo/examples/nlp/language_modeling/megatron_gpt_prompt_learning.py \
--config-name=megatron_gpt_prompt_learning_squad.yaml \
trainer.devices=2 \
trainer.num_nodes=1 \
trainer.max_epochs=25 \
trainer.precision=bf16 \
model.language_model_path=/workspace/nemo/tutorials/nlp/nemo-megatron-gpt-5B/nemo_gpt5B_fp16_tp2.nemo.extracted \
model.nemo_path=/workspace/nemo/examples/nlp/language_modeling/squad.nemo \
model.tensor_model_parallel_size=2 \
model.pipeline_model_parallel_size=1 \
model.global_batch_size=16 \
model.micro_batch_size=1 \
model.optim.lr=1e-4 \
model.data.train_ds=[/workspace/nemo/tutorials/nlp/data/SQuAD/squad_train.jsonl] \
model.data.validation_ds=[/workspace/nemo/tutorials/nlp/data/SQuAD/squad_val.jsonl]
Note the following:
- model.tensor_model_parallel_size should be set to 2 for the 5B GPT model (nemo_gpt5B_fp16_tp2.nemo) or 4 for the 20B GPT-3 model.
- trainer.devices should be set to a multiple of the TP value. For the 5B model, setting it to 4 trains two data-parallel replicas of the model, each occupying two GPUs.
- model.language_model_path should be set to the absolute path of the extracted model directory.
- model.data.train_ds and model.data.validation_ds should be set to the locations of the training and validation data files.
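As a quick sanity check on these values, the launcher derives the degree of data parallelism with standard Megatron-style arithmetic; the following small illustration uses the 5B, TP=2 example above:
devices, num_nodes = 4, 1        # trainer.devices, trainer.num_nodes
tp, pp = 2, 1                    # tensor / pipeline model parallel sizes
world_size = devices * num_nodes             # 4 GPUs in total
data_parallel_size = world_size // (tp * pp)
print(data_parallel_size)                    # -> 2 model replicas, each on 2 GPUs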
Inference
Finally, once the model is trained, run inference in NeMo using the following script:
python /workspace/nemo/examples/nlp/language_modeling/megatron_gpt_prompt_learning_eval.py \
virtual_prompt_model_file=/workspace/nemo/examples/nlp/language_modeling/squad.nemo \
gpt_model_file=/workspace/nemo/tutorials/nlp/nemo-megatron-gpt-5B/nemo_gpt5B_fp16_tp2.nemo.extracted \
inference.greedy=True \
inference.add_BOS=False \
inference.tokens_to_generate=128 \
trainer.devices=2 \
trainer.num_nodes=1 \
tensor_model_parallel_size=2 \
pipeline_model_parallel_size=1 \
data_paths=["/workspace/nemo/tutorials/nlp/data/SQuAD/squad_test.jsonl"] \
pred_file_path="test-results.txt"
Note the following:
- tensor_model_parallel_size should be set to 2 for the 5B GPT model (nemo_gpt5B_fp16_tp2.nemo) or 4 for the 20B GPT-3 model.
- trainer.devices should be set equal to the TP value.
- pred_file_path is the file where the test results are recorded, which you can inspect to verify the model's answers.
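Once the run completes, the predictions can be inspected directly; for example (the file name comes from the command above):
# Print the first few generated answers written by the eval script
with open("test-results.txt") as f:
    for line in f.readlines()[:5]:
        print(line.rstrip())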
Customize large language models with NeMo
This post covered how to customize an LLM for new tasks with prompt learning in NeMo, using far less compute than training a model from scratch or fine-tuning all of its parameters. With this parameter-efficient technique, enterprises can adapt a single pretrained model to many downstream NLP applications without maintaining separate fine-tuned copies.
To get started with LLM customization, visit NVIDIA/NeMo on GitHub and try the steps in this post for yourself.
Related resources
- GTC session: An Introduction to Developing with Project Mellon (Spring 2023)
- GTC session: Scaling Large Language Model Training with PAX on GPUs (Spring 2023)
- GTC session: Deep Learning, LLM’s & Generative Models for Computer Games and Creative Industries (Spring 2023)
- SDK: NeMo Megatron
- SDK: NeMo LLM Service
- SDK: Nsight Deep Learning Designer