Ego4D + Ego-Exo4D | Embodied Data Atlas

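Object breakdown

First separate the dataset, method, platform, and model layers, so that different kinds of objects are not lumped under one label.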

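Ego4D (first-person video corpus): covers everyday human activities, scenes, object interactions, and temporal understanding.

Ego-Exo4D (multi-view skill data): synchronizes ego video with exo cameras, pose, audio, and expert commentary.

Embodied upstream (upstream data layer): suited to perception and skill understanding, but not direct control data.

Difference (unlike DROID): DROID contains robot actions; the Ego4D family is mainly human experience.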

The path from collection to training use, which helps judge how close the data is to a robot policy.

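01 People wear first-person capture devices or work inside a multi-camera capture environment

02 Activity video, audio, and environmental context are recorded

03 Ego-Exo4D synchronizes external cameras and 3D pose

04 Narrations, annotations, and benchmark splits are produced

05 The data is used for perception, skill understanding, action segmentation, or human-to-robot research

06 Projects such as EgoVerse then attempt to convert it into robot-learning friendly episodes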

The fields here are a reading-level summary; for the concrete schema, defer to the official docs, loaders, and dataset card.
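Step 03 above depends on time alignment between the ego and exo streams. As an illustration only, the sketch below pairs each ego frame with the nearest exo frame by timestamp. The function name `align_nearest`, the tolerance value, and the timestamp lists are all hypothetical and are not taken from the official loaders; the released takes are already synchronized, so this only shows what "synchronized" means at the frame level.

```python
from bisect import bisect_left
from typing import List, Optional


def align_nearest(ego_ts: List[float], exo_ts: List[float],
                  tolerance_s: float = 0.02) -> List[Optional[int]]:
    """For each ego timestamp, return the index of the nearest exo timestamp,
    or None when no exo frame lies within tolerance_s seconds.

    exo_ts must be sorted in ascending order (hypothetical input format).
    """
    matches: List[Optional[int]] = []
    for t in ego_ts:
        pos = bisect_left(exo_ts, t)
        # Only the neighbors just before and just after the insertion point
        # can be the nearest timestamp.
        candidates = [i for i in (pos - 1, pos) if 0 <= i < len(exo_ts)]
        if not candidates:
            matches.append(None)
            continue
        best = min(candidates, key=lambda i: abs(exo_ts[i] - t))
        matches.append(best if abs(exo_ts[best] - t) <= tolerance_s else None)
    return matches


# Made-up timestamps in seconds; the last ego frame has no exo partner.
ego = [0.000, 0.033, 0.067, 0.500]
exo = [0.001, 0.034, 0.066]
print(align_nearest(ego, exo))  # [0, 1, 2, None]
```

Returning `None` for unmatched frames, rather than forcing a match, keeps gaps in one stream visible to downstream code.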

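Example slice

This page belongs in the human embodied perception layer: it is closer to the body and to tasks than ordinary web video, offers fewer robot-training interfaces than EgoVerse/Ropedia, and sits farther from control policies than DROID/OXE.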

Layer	Project / result	Organization	Public scale	Data / method form	Relation to this project
Generic video	Web video / VLM corpora	mixed sources	internet scale	unstructured video-text data	Larger, but weak on body viewpoint and action annotation.
Egocentric human data	Ego4D + Ego-Exo4D	Meta / academic consortium	3,600+ h and 1,286.3 h tracks	ego video, exo cameras, pose, narrations	Core subject of this page.
Human-to-robot data	EgoVerse	Georgia Tech / collaborators	1,362 h demonstrations	human egocentric episodes for robot learning	Closer to a training interface.
Robot trajectory data	DROID / OXE	robotics labs	tens of thousands to 1M+ trajectories	robot images, actions, states	Used directly for robot policies.
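The ordering in the table can be written down as a small ordered type, which makes comparisons such as "closer to a control policy than" explicit. This is a minimal sketch: the names `AtlasLayer`, `PLACEMENT`, and `closer_to_policy` are hypothetical, and the integer ranks encode only the row order of the table, not a measured distance.

```python
from enum import IntEnum


class AtlasLayer(IntEnum):
    """Layers from the table, ranked by closeness to a robot control policy."""
    GENERIC_VIDEO = 0
    EGOCENTRIC_HUMAN = 1
    HUMAN_TO_ROBOT = 2
    ROBOT_TRAJECTORY = 3


# Project-to-layer mapping copied from the table rows.
PLACEMENT = {
    "Web video / VLM corpora": AtlasLayer.GENERIC_VIDEO,
    "Ego4D + Ego-Exo4D": AtlasLayer.EGOCENTRIC_HUMAN,
    "EgoVerse": AtlasLayer.HUMAN_TO_ROBOT,
    "DROID / OXE": AtlasLayer.ROBOT_TRAJECTORY,
}


def closer_to_policy(a: str, b: str) -> bool:
    """True when project a sits in a layer nearer to robot control than b."""
    return PLACEMENT[a] > PLACEMENT[b]


print(closer_to_policy("EgoVerse", "Ego4D + Ego-Exo4D"))     # True
print(closer_to_policy("Ego4D + Ego-Exo4D", "DROID / OXE"))  # False
```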

This section separates official facts, structural interpretation, and positioning against adjacent projects.

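Strongest contribution: provides large-scale human embodied experience and multi-view skill understanding data.

Does not solve: it lacks executed robot actions, so it cannot be used directly as imitation policy data.

Strategic significance: it is an important upstream link in the chain from human experience to robot policy.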

Use these points to decide quickly which layer of the atlas this project belongs in.

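01 The Ego4D family is human embodied perception data, not robot trajectories.

02 The value of Ego-Exo4D lies in its synchronized ego/exo views and 3D motion context.

03 When placing it in the atlas, state explicitly that one conversion layer still separates it from robot action.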
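Points 01 and 03 can be made concrete with a small check: imitation learning needs (observation, action) pairs, and a human video episode supplies only observations. The sketch below is illustrative and assumes a hypothetical `Episode` container and a caller-supplied `convert` function standing in for the conversion layer; it does not describe how EgoVerse or any other project performs that conversion.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple

Observation = dict        # hypothetical, e.g. {"frame_id": 0}
Action = Sequence[float]  # hypothetical, e.g. an end-effector delta


@dataclass
class Episode:
    observations: List[Observation]
    actions: Optional[List[Action]] = None  # None for human video


def behavior_cloning_pairs(
    episode: Episode,
    convert: Optional[Callable[[Episode], List[Action]]] = None,
) -> List[Tuple[Observation, Action]]:
    """Return (observation, action) pairs, converting first when actions are absent."""
    actions = episode.actions
    if actions is None:
        if convert is None:
            raise ValueError("no robot actions: a conversion step is required")
        actions = convert(episode)
    if len(actions) != len(episode.observations):
        raise ValueError("observations and actions must have equal length")
    return list(zip(episode.observations, actions))


# A robot-style episode is usable as is.
robot = Episode(observations=[{"frame_id": 0}], actions=[[0.1, 0.0, -0.2]])
print(len(behavior_cloning_pairs(robot)))  # 1

# A human-style episode is rejected until some conversion is supplied.
human = Episode(observations=[{"frame_id": 0}, {"frame_id": 1}])
try:
    behavior_cloning_pairs(human)
except ValueError as err:
    print(err)  # no robot actions: a conversion step is required
```

Passing the conversion in as a parameter keeps the gap explicit: the human episode never becomes policy data silently.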

Sources: official pages, papers, code, data cards, and download docs take priority.

Ego4D Docs https://ego4d-data.org/docs/

Ego4D Paper https://arxiv.org/abs/2110.07058

Ego-Exo4D Docs https://docs.ego-exo4d-data.org/overview/

Ego-Exo4D Paper https://arxiv.org/abs/2311.18259