Beyond Language: Inside a Hundred_Trillion_Token Video Model