ideas | Gongtianxiang Blog

1. Adversarial attacks?

In computer vision (CV), a perturbation too small to see can flip a classifier's prediction.

With GPT-4: the prompt changes imperceptibly, and the LLM evaluation result is overturned.
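To make the idea concrete, here is a minimal sketch of a typo-style prompt perturbation and a way to count how often a model's output flips. It is not PromptBench's API: `perturb_prompt`, `flip_rate`, and the `model` callable are hypothetical names introduced only for illustration, and adjacent-letter swaps are just one simple stand-in for an "imperceptible" change.

```python
import random
from typing import Callable, List


def perturb_prompt(prompt: str, n_swaps: int = 1, seed: int = 0) -> str:
    """Swap up to `n_swaps` pairs of adjacent letters (a typo-style edit)."""
    rng = random.Random(seed)
    chars = list(prompt)
    # Only swap two different letters that sit next to each other,
    # so spaces and punctuation stay where they are.
    candidates = [
        i for i in range(len(chars) - 1)
        if chars[i].isalpha() and chars[i + 1].isalpha()
        and chars[i] != chars[i + 1]
    ]
    for i in rng.sample(candidates, min(n_swaps, len(candidates))):
        chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


def flip_rate(
    model: Callable[[str], str],  # hypothetical: maps a full prompt to an output
    instruction: str,
    inputs: List[str],
    n_swaps: int = 1,
    trials: int = 20,
) -> float:
    """Fraction of (perturbation, input) pairs where the output changes."""
    flips = 0
    total = 0
    for seed in range(trials):
        adv = perturb_prompt(instruction, n_swaps, seed)
        for x in inputs:
            clean_out = model(f"{instruction}\n{x}")
            adv_out = model(f"{adv}\n{x}")
            flips += clean_out != adv_out
            total += 1
    return flips / total if total else 0.0


if __name__ == "__main__":
    instruction = "Classify the sentiment of the review."
    print(perturb_prompt(instruction, n_swaps=2, seed=1))
```

Only the instruction is perturbed while the evaluated input stays fixed, so any change in the output can be attributed to the prompt wording alone.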

💡 What has Microsoft done?

PromptBench - a Hugging Face Space by March07

https://huggingface.co/spaces/March07/PromptBench

GitHub - microsoft/promptbench: A robustness evaluation framework for large language models on adversarial prompts

https://github.com/microsoft/promptbench/tree/main

2. Evaluation dataset:

alpaca_farm

tatsu-lab/alpaca_farm · Datasets at Hugging Face

https://huggingface.co/datasets/tatsu-lab/alpaca_farm

3. Give a reference / reason

For example, PandaLM.

Given two different LLM responses, how can we make sure the judgement is reasonable?

Have the judge give its reason.
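A minimal sketch of a pairwise judge that must justify its verdict might look like the following. This is not PandaLM's actual prompt or output format: `JUDGE_TEMPLATE`, `parse_judgement`, `judge_pair`, and the `judge` callable are hypothetical names for illustration. The order-swap consistency check is an extra assumption added here and does not come from the notes above.

```python
import re
from typing import Callable, NamedTuple, Optional

# Hypothetical prompt format; the judge is asked for a verdict AND a reason.
JUDGE_TEMPLATE = (
    "Instruction:\n{instruction}\n\n"
    "Response 1:\n{first}\n\n"
    "Response 2:\n{second}\n\n"
    "Which response is better? Answer on two lines:\n"
    "Verdict: 1, 2, or Tie\n"
    "Reason: one or two sentences justifying the verdict\n"
)


class Judgement(NamedTuple):
    verdict: str  # "1", "2", or "tie"
    reason: str


def parse_judgement(text: str) -> Optional[Judgement]:
    """Return the verdict and reason, or None if either is missing."""
    verdict = re.search(r"^Verdict:\s*(1|2|Tie)\b", text, re.I | re.M)
    reason = re.search(r"^Reason:\s*(\S.*)$", text, re.I | re.M)
    if not verdict or not reason:
        return None  # a verdict without a stated reason is rejected
    return Judgement(verdict.group(1).lower(), reason.group(1).strip())


def judge_pair(
    judge: Callable[[str], str],  # hypothetical: maps a prompt to the judge's text
    instruction: str,
    a: str,
    b: str,
) -> Optional[Judgement]:
    """Judge (a, b) in both orders; keep the verdict only if the orders agree."""
    fwd = parse_judgement(
        judge(JUDGE_TEMPLATE.format(instruction=instruction, first=a, second=b))
    )
    rev = parse_judgement(
        judge(JUDGE_TEMPLATE.format(instruction=instruction, first=b, second=a))
    )
    if fwd is None or rev is None:
        return None
    # Map the reversed-order verdict back to the original ordering.
    flipped = {"1": "2", "2": "1", "tie": "tie"}[rev.verdict]
    if fwd.verdict != flipped:
        return None  # verdict depends on position, so treat it as unreliable
    return fwd
```

Requiring the reason in a fixed format makes each verdict auditable by a human, and discarding order-dependent verdicts filters out one common source of unreasonable judgements.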

Leaderboard

Alpaca Eval Leaderboard

https://tatsu-lab.github.io/alpaca_eval/
