G-3PO: A protocol droid for Ghidra, or GPT-3 for reverse-engineering

Posted on January 4, 2023 By 📢 ℕ𝕒𝕤𝕒𝕣𝕖™

(A Script that Solicits GPT-3 for Comments on Decompiled Code)

“This is the droid you’re looking for,” said Wobi Kahn Bonobi.

In this post, I introduce a new Ghidra script that elicits high-level explanatory comments for decompiled function code from the GPT-3 large language model. The script is called G-3PO. In the first few sections of the post, I discuss the motivation and rationale for building such a tool, in the context of existing automated tooling for software reverse engineering. I look at what many of our tools (disassemblers, decompilers, and so on) have in common, insofar as they can be thought of as automated paraphrase or translation tools. I spend a bit of time looking at how well (or poorly) GPT-3 handles these various tasks, and then sketch out the design of this new tool.

If you just want to skip the discussion and get yourself set up with the tool, feel free to scroll down to the final section, and then work backwards from there if you like.

The GitHub repository for G-3PO can be found HERE.

At the present state of things, the domain of reverse engineering seems like a fertile site for applying machine learning techniques. ML tends to excel, after all, at problems where getting the gist of things counts, where the emphasis is on picking out patterns that might otherwise go unnoticed, and where error is either tolerable or can be corrected by other means. This kind of loose and conjectural pattern recognition is where reverse engineering begins. We start by trying to get a feel for a system, a sense of how it hangs together, and then try to tunnel down. Impressions can be deceptive, of course, but this is a field where they are easily tested, and where suitable abstractions are both sought and mistrusted.

A still from the movie, Matrix, showing Cipher in front of monitors displaying arcane data dumps, and saying “You get used to it. I don’t even see the code anymore. All I see is blonde, brunette, redhead,” or something like that.
You get used to it.

The goal, after all, is to understand (some part of) a system better than its developers do, to piece together its specification and where the specification breaks down.

At many stages along the way, you could say that what the reverse engineer is doing is searching for ways to paraphrase what they’re looking at, or translate it from one language into another.

We might start, for example, with an opaque binary “blob” (to use a semi-technical term for unanalyzed data) that we dumped off a router’s NAND storage. The first step might be to tease out its file format and, through a process of educated guesses and experiments, find a way to parse it. Perhaps it turns out to contain a squashfs file system, holding the router’s firmware. We have various tools, like Binwalk, to help with this stage of things, which we know can’t be trusted entirely but which might provide useful hints, or even get us to the next stage.
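
To make the "educated guess" step concrete, here is a minimal sketch of the kind of signature scan that tools like Binwalk perform at much larger scale. It only looks for the four-byte SquashFS superblock magic (`hsqs` for little-endian images, `sqsh` for big-endian ones); the function name and the synthetic dump are hypothetical, and a hit is a hint to investigate, not proof of a filesystem.

```python
# Magic bytes found at the start of a SquashFS superblock.
SQUASHFS_MAGICS = {
    b"hsqs": "little-endian",
    b"sqsh": "big-endian",
}


def find_squashfs(blob: bytes) -> list[tuple[int, str]]:
    """Return (offset, endianness) for every SquashFS magic found in blob."""
    hits = []
    for magic, endianness in SQUASHFS_MAGICS.items():
        start = 0
        while (pos := blob.find(magic, start)) != -1:
            hits.append((pos, endianness))
            start = pos + 1  # keep scanning past this hit
    return sorted(hits)


if __name__ == "__main__":
    # Synthetic stand-in for a NAND dump: erased flash, then a fake superblock.
    dump = b"\xff" * 0x40 + b"hsqs" + bytes(92)
    for offset, endianness in find_squashfs(dump):
        print(f"possible squashfs ({endianness}) at offset {offset:#x}")
```

A four-byte match will also turn up by chance in large blobs, which is one reason such hints can't be trusted entirely: the next step would be to check that the bytes following the magic parse as a plausible superblock.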

Suppose we then unpack the firmware, mount it as a filesystem, and explore the contents. Perhaps we find an interesting-looking application binary, called something like telnetd_startup. Rather than reading it as an opaque blob of bits, we look for a way to make sense of it, usually starting by parsing its file structure (let’s say it’s an ELF) and disassembling it: translating the binary file into a sequence, or better, a directed graph of assembly instructions. For this step we might lean on tools like objdump, rizin, IDA Pro (if we have an expense account), or, my personal favourite, Ghidra. There’s room for error here as well, and sometimes even the best tools we have will get off on the wrong foot and parse data as code, or misjudge the offset of a sequence of instructions and produce a garbled listing, but you get to recognize the kinds of errors that these sorts of tools are prone to, especially when dealing with unknown file formats. You learn various heuristics and rules of thumb to minimize and correct these errors. But tools that can automate the translation of a binary blob into readable assembly are nevertheless essential, to the extent that if you were faced with a binary that used an unknown instruction set, your first priority as a reverse engineer may very well be to figure out how to write at least a rough and incomplete disassembler for it.
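
As an illustration of what a "rough and incomplete disassembler" can look like, here is a minimal sketch that decodes exactly one kind of 32-bit ARM (A32) instruction, the `B`/`BL` branch, and gives up on everything else. The function name is hypothetical, and the example words in the usage note were assembled by hand to match branch instructions of the form `bls #0x44b2d564`; they are assumptions, not bytes taken from any firmware.

```python
# A32 condition-code suffixes, indexed by bits 31..28 of the instruction word.
# Index 14 ("always") has no suffix; 15 is not a condition and is rejected below.
CONDITIONS = ["eq", "ne", "hs", "lo", "mi", "pl", "vs", "vc",
              "hi", "ls", "ge", "lt", "gt", "le", ""]


def decode_branch(word: int, address: int):
    """Decode an A32 B/BL instruction, or return None for anything else."""
    cond_bits = (word >> 28) & 0xF
    if cond_bits == 0xF or (word >> 25) & 0b111 != 0b101:
        return None  # not a conditional/unconditional branch: this toy gives up
    link = "l" if (word >> 24) & 1 else ""
    imm24 = word & 0xFFFFFF
    if imm24 & 0x800000:  # sign-extend the 24-bit word offset
        imm24 -= 0x1000000
    # The offset counts 4-byte words, relative to the instruction address + 8.
    target = (address + 8 + (imm24 << 2)) & 0xFFFFFFFF
    return f"b{link}{CONDITIONS[cond_bits]} #{target:#x}"
```

For example, `decode_branch(0x9A00002A, 0x44B2D4B4)` yields `bls #0x44b2d564`, while a data-processing word such as `0xE3520003` yields `None`. Even a decoder this small is enough to start recovering a control-flow graph, which is usually the first thing worth having.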

The disassembly listing of a binary gives us a fine-grained picture of its application logic, and sometimes that’s the furthest that automated tools can take us. But it’s still a far cry from the code that its developer may have been working with: very few programs are written in assembly these days, and it’s easy to get lost in the weeds without a higher-level vantage point. This may be where the reverser begins the patient manual work of discovering interesting parts of the binary (parts where it’s handling user input, for example) by stepping through the binary with a debugger like GDB (perhaps with the help of an emulator, like QEMU), and then annotating the disassembly listing with comments. In doing so the reverser tries to produce a high-level paraphrase of the program.

These days, however, we usually have access to another set of tools called decompilers, which can at least roughly translate the disassembly listing into something that looks like source code, usually something like C (but extended with a few pseudo types, like Ghidra’s undefined and undefined*, to mark missing information). (Other tools, static analysis frameworks like BAP or angr (or, internally, Ghidra or Binary Ninja), for example, might be used to “lift” or translate the binary to an intermediate representation more amenable to further automated analysis, but we’ll leave those aside for now.) Decompilation is a heuristically driven and inexact art, to a significantly greater extent than disassembly. When source code (in C, for example) is compiled down to x86 or ARM machine code, there’s an irreversible loss of information, and moving back in the other direction involves a bit of guesswork, guided by contextual clues and constraints. When reverse engineers work with decompilers, we take it for granted that the decompiler is probably getting at least a few things wrong. But I doubt anyone would say that they’re unhelpful. We can, and often must, go back to the disassembly listing whenever needed, after all. And when something seems fishy there, we can go back to the binary’s file format, and see if something’s been parsed incorrectly.

In my day-to-day work, this is usually where automated analysis stops and where manual annotation and paraphrase begins. I slowly read through the decompiler’s output and try to figure out, in ordinary language, what the code is “supposed” to be doing, and what it’s actually doing. It’s a long process of conjecture and refutation, often involving the use of debuggers, emulators, and tracers to test interpretations of the code. I might probe the running or emulated binary with various inputs and observe the effects. I might even try to do this in a brute-force fashion, at scale, “fuzzing” the binary and looking for anomalous behaviour. But a great deal of time is spent just adding comments to the binary in Ghidra, correcting misleading type information and coming up with informative names for the functions and variables in play (especially if the binary’s been stripped and symbols are missing). Let’s call this the process of annotation.

We might notice that many of the automated stages in the reverse engineer’s job (parsing and unpacking the firmware blob, disassembling binary executables, and then decompiling them) can at least loosely be described as processes of translation or paraphrase. And the same can be said for annotation.

This brings us back to machine learning.

If there’s one thing that large language models, like OpenAI’s GPT-3, have shown themselves to be especially good at, it’s paraphrase, whether it’s a matter of translating between one language and another, summarising an existing knowledge base, or rewriting a text in the style of a particular author. Once you notice this, as I did last week while flitting back and forth between a project I was working on in Ghidra and a browser tab opened to ChatGPT, it might seem natural to see how an LLM handles the kinds of “paraphrasing” involved in a typical software reverse engineering workflow.

The example I’ll be working with here, unless otherwise noted, is a function carved from a firmware binary I dumped from a Canon ImageClass MF743Cdw printer.

GPT-3 Makes a Lousy Disassembler

Let’s start with disassembly:

A screenshot showing me prompting ChatGPT with a hexdump of ARM machine code.

Disassembly seems to fall squarely outside of ChatGPT’s scope, which isn’t surprising. It was trained on “natural language” in the broad sense, after all, and not on binary dumps.

A screenshot showing ChatGPT offering an incorrect disassembly of the ARM binary snippet.
A failed attempt by ChatGPT to disassemble some ARM machine code.

The GPT-3 text-davinci-003 model does no better:

A screenshot showing the text-davinci-003 model fail to provide an accurate disassembly of the hexdumped binary provided.

This, again, would be great, if it weren’t entirely wrong. Here’s what Capstone (correctly) returns for the same input:

0x44b2d4b0: cmp r2, #3
0x44b2d4b4: bls #0x44b2d564
0x44b2d4b8: ands ip, r0, #3
0x44b2d4bc: beq #0x44b2d4e4
0x44b2d4c0: ldrb r3, [r1], #1
0x44b2d4c4: cmp ip, #2
0x44b2d4c8: add r2, r2, ip
0x44b2d4cc: ldrbls ip, [r1], #1
0x44b2d4d0: strb r3, [r0], #1
0x44b2d4d4: ldrblo r3, [r1], #1
0x44b2d4d8: strbls ip, [r0], #1
0x44b2d4dc: sub r2, r2, #4
0x44b2d4e0: strblo r3, [r0], #1
0x44b2d4e4: ands r3, r1, #3
0x44b2d4e8: beq #0x44b36318
0x44b2d4ec: subs r2, r2, #4
0x44b2d4f0: blo #0x44b2d564
0x44b2d4f4: ldr ip, [r1, -r3]!
0x44b2d4f8: cmp r3, #2
0x44b2d4fc: beq #0x44b2d524
0x44b2d500: bhi #0x44b2d544
0x44b2d504: lsr r3, ip, #8
0x44b2d508: ldr ip, [r1, #4]!
0x44b2d50c: subs r2, r2, #4
0x44b2d510: orr r3, r3, ip, lsl #24
0x44b2d514: str r3, [r0], #4
0x44b2d518: bhs #0x44b2d504
0x44b2d51c: add r1, r1, #1
0x44b2d520: lsr r3, ip, #0x10
0x44b2d524: ldr ip, [r1, #4]!
0x44b2d528: subs r2, r2, #4
0x44b2d52c: orr r3, r3, ip, lsl #16
0x44b2d530: str r3, [r0], #4
0x44b2d534: bhs #0x44b2d520
0x44b2d538: add r1, r1, #2
0x44b2d53c: lsr r3, ip, #0x18
0x44b2d540: ldr ip, [r1, #4]!
0x44b2d544: subs r2, r2, #4
0x44b2d548: orr r3, r3, ip, lsl #8
0x44b2d54c: str r3, [r0], #4
0x44b2d550: bhs #0x44b2d53c
0x44b2d554: add r1, r1, #3
0x44b2d558: lsls r2, r2, #0x1f
0x44b2d55c: ldrbhs r3, [r1], #1
0x44b2d560: ldrbhs ip, [r1], #1
0x44b2d564: ldrbmi r2, [r1], #1
0x44b2d568: strbhs r3, [r0], #1
0x44b2d56c: strbhs ip, [r0], #1
0x44b2d570: strbmi r2, [r0], #1
0x44b2d574: bx lr
0x44b2d578: subs r2, r2, #0x20
0x44b2d57c: blo #0x44b2d5b0
0x44b2d580: cmp r2, #0x80
0x44b2d584: blo #0x44b2d5a0
0x44b2d588: pld [r1, #0x80]
0x44b2d58c: ldm r1!, {r3, r4, r5, r6, r7, r8, ip, lr}
0x44b2d590: sub r2, r2, #0x20
0x44b2d594: stm r0!, {r3, r4, r5, r6, r7, r8, ip, lr}
0x44b2d598: ldm r1!, {r3, r4, r5, r6, r7, r8, ip, lr}
0x44b2d59c: subs r2, r2, #0x20
0x44b2d5a0: stm r0!, {r3, r4, r5, r6, r7, r8, ip, lr}
0x44b2d5a4: bhs #0x44b2d598
0x44b2d5a8: lsls ip, r2, #0x1c
0x44b2d5ac: ldmhs r1!, {r3, r4, ip, lr}
0x44b2d5b0: stmhs r0!, {r3, r4, ip, lr}
0x44b2d5b4: ldmmi r1!, {r3, r4}
0x44b2d5b8: stmmi r0!, {r3, r4}
0x44b2d5bc: lsls ip, r2, #0x1e
0x44b2d5c0: ldrhs r3, [r1], #4
0x44b2d5c4: strhs r3, [r0], #4
0x44b2d5c8: bxeq lr
0x44b2d5cc: lsls r2, r2, #0x1f
0x44b2d5d0: ldrhhs r3, [r1], #2
0x44b2d5d4: ldrbmi r2, [r1], #1
0x44b2d5d8: strhhs r3, [r0], #2
0x44b2d5dc: strbmi r2, [r0], #1
0x44b2d5e0: bx lr

GPT-3 Makes a Pretty Good Decompiler

Things look a lot brighter when we turn to decompilation.

If I first disassemble the binary using Capstone and then ask the LLM to decompile the result, I start to get surprisingly accurate and informative answers.
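The two steps described here can be sketched roughly as follows. This is only an illustration: the helper names (`disassemble_arm`, `format_listing`, `build_decompile_prompt`) are hypothetical, the Capstone import assumes the `capstone` Python package is installed, and the prompt wording mirrors the one used in this post.

```python
FENCE = "`" * 3  # three backticks, built indirectly so this listing stays readable


def disassemble_arm(code, base_address):
    """Disassemble 32-bit ARM machine code into (address, mnemonic, operands) tuples."""
    # Imported lazily so the formatting helpers below work without Capstone installed.
    from capstone import Cs, CS_ARCH_ARM, CS_MODE_ARM
    md = Cs(CS_ARCH_ARM, CS_MODE_ARM)
    return [(insn.address, insn.mnemonic, insn.op_str)
            for insn in md.disasm(code, base_address)]


def format_listing(instructions):
    """Render instruction tuples in the '0xADDR: mnemonic operands' style of the listing."""
    return "\n".join("0x{:x}: {} {}".format(addr, mnem, ops).rstrip()
                     for addr, mnem, ops in instructions)


def build_decompile_prompt(listing):
    """Wrap a listing in a completion-style prompt that ends on an open code fence."""
    return ("Here is some ARM assembly code:\n\n" + FENCE + "\n" + listing + "\n"
            + FENCE + "\n\nIt can be decompiled into C code as follows:\n\n"
            + FENCE + "\n")


# Example with a single hand-written instruction tuple (no Capstone needed):
listing = format_listing([(0x44b2d574, "bx", "lr")])
prompt = build_decompile_prompt(listing)
```

Ending the prompt on an open fence nudges a completion model to continue with C code rather than prose.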

>>> prompt_openai_model(f"Here is some ARM assembly code:\n\n```\n{s}\n```\n\nIt can be decompiled into C code as follows:\n\n```\n",
model="text-davinci-003")

This gives us the following:

if (r2==3) {
if ((r0 & 3)==0) {
while (r2> 0) {
r3=*r1 ;
*r0 =r3;
r2--;
}
} else {
while (r2> 0) {
if ((r0 & 3)==2) {
r3=*r1 ;
*r0 =r3;
r2--;
} else {
ip=*r1 ;
r2 -=4;
if (r3==2) {
*r0 =(ip } else if (r3 *r0 =(ip r1 ;
} else {
*r0 =(ip r1 =3;
}
}
}
}
} else {
while (r2> 0) {
if (r2>=32) {
if (r2>=128) {
for (int i=0; i *r0 =*r1 ;
*r0 =*r1 ;
*r0 =*r1 ;
*r0 =*r1 ;
r2 -=32;
}
} else {
*r0 =*r1 ;
*r0 =*r1 ;
*r0 =*r1 ;
*r0 =*r1 ;
r2 -=32;
}
} else {
if (r2>=16) {
*r0 =*r1 ;
*r0 =*r1 ;
*r0 =*r1 ;
*r0 =*r1 ;
r2 -=16;
} else {
if (r2>=8) {
*r0 =*r1 ;
*r0 =*r1 ;
r2 -=8;
} else {
if (r2>=4) {
*r0 =*r1 ;
r2 -=4;
} else {
if (r2>=2) {
*r0 =*r1 ;
r2 -=2;
} else {
*r0 =*r1 ;
r2--;
}
}
}
}
}
}
}

Compare this with what Ghidra gives us:


undefined8 FUN_44b2d4b0(uint *param_1,uint *param_2,uint param_3,uint param_4)

{
uint *puVar1;
uint *puVar2;
byte *pbVar3;
uint *puVar4;
byte bVar5;
undefined2 uVar6;
uint uVar7;
uint uVar8;
uint uVar9;
uint uVar10;
uint uVar11;
uint uVar12;
byte bVar13;
uint in_r12;
uint uVar14;
uint uVar15;
uint uVar16;
bool bVar17;
bool bVar18;

if (3 uVar14=param_1 & 3;
in_r12=uVar14;
if (uVar14 !=0) {
bVar5=*param_2;
puVar2=param_2 1;
if (uVar14 puVar2=param_2 2;
in_r12=*(param_2 1);
}
*param_1=bVar5;
param_2=puVar2;
if (uVar14 param_2=puVar2 1;
bVar5=*puVar2;
}
puVar2=param_1 1;
if (uVar14 puVar2=param_1 2;
*(param_1 1)=in_r12;
}
param_3=(param_3 uVar14) - 4;
param_1=puVar2;
if (uVar14 param_1=puVar2 1;
*puVar2=bVar5;
}
}
param_4=param_2 & 3;
if (param_4==0) {
uVar14=param_3 - 0x20;
if (0x1f for (; 0x7f HintPreloadData(param_2 0x20);
uVar7=*param_2;
uVar8=param_2[1];
uVar9=param_2[2];
uVar10=param_2[3];
uVar11=param_2[4];
uVar12=param_2[5];
uVar15=param_2[6];
uVar16=param_2[7];
param_2=param_2 8;
*param_1=uVar7;
param_1[1]=uVar8;
param_1[2]=uVar9;
param_1[3]=uVar10;
param_1[4]=uVar11;
param_1[5]=uVar12;
param_1[6]=uVar15;
param_1[7]=uVar16;
param_1=param_1 8;
}
do {
param_4=*param_2;
uVar7=param_2[1];
uVar8=param_2[2];
uVar9=param_2[3];
uVar10=param_2[4];
uVar11=param_2[5];
uVar12=param_2[6];
uVar15=param_2[7];
param_2=param_2 8;
bVar17=0x1f uVar14=uVar14 - 0x20;
*param_1=param_4;
param_1[1]=uVar7;
param_1[2]=uVar8;
param_1[3]=uVar9;
param_1[4]=uVar10;
param_1[5]=uVar11;
param_1[6]=uVar12;
param_1[7]=uVar15;
param_1=param_1 8;
} while (bVar17);
}
if (uVar14>> 4 & 1) {
param_4=*param_2;
uVar7=param_2[1];
uVar8=param_2[2];
uVar9=param_2[3];
param_2=param_2 4;
*param_1=param_4;
param_1[1]=uVar7;
param_1[2]=uVar8;
param_1[3]=uVar9;
param_1=param_1 4;
}
if (uVar14 param_4=*param_2;
uVar7=param_2[1];
param_2=param_2 2;
*param_1=param_4;
param_1[1]=uVar7;
param_1=param_1 2;
}
puVar1=param_1;
puVar2=param_2;
if (uVar14>> 2 & 1) {
puVar2=param_2 1;
param_4=*param_2;
puVar1=param_1 1;
*param_1=param_4;
}
uVar6=param_4;
if ((uVar14 & 3) !=0) {
bVar18=uVar14>> 1 & 1;
uVar14=uVar14 bVar17=uVar14 puVar4=puVar2;
if (bVar18) {
puVar4=puVar2 2;
uVar6=*puVar2;
}
puVar2=puVar4;
if (bVar17) {
puVar2=puVar4 1;
uVar14=*puVar4;
}
puVar4=puVar1;
if (bVar18) {
puVar4=puVar1 2;
*puVar1=uVar6;
}
puVar1=puVar4;
if (bVar17) {
puVar1=puVar4 1;
*puVar4=uVar14;
}
return CONCAT44(puVar2,puVar1);
}
return CONCAT44(puVar2,puVar1);
}
bVar17=3 param_3=param_3 - 4;
if (bVar17) {
param_2=param_2 - param_4;
in_r12=*param_2;
puVar2=param_1;
if (param_4==2) {
do {
puVar1=param_2;
param_4=in_r12>> 0x10;
param_2=puVar1 1;
in_r12=*param_2;
bVar17=3 param_3=param_3 - 4;
param_4=param_4 | in_r12 param_1=puVar2 1;
*puVar2=param_4;
puVar2=param_1;
} while (bVar17);
param_2=puVar1 6;
}
else if (param_4 do {
puVar1=param_2;
param_4=in_r12>> 8;
param_2=puVar1 1;
in_r12=*param_2;
bVar17=3 param_3=param_3 - 4;
param_4=param_4 | in_r12 param_1=puVar2 1;
*puVar2=param_4;
puVar2=param_1;
} while (bVar17);
param_2=puVar1 5;
}
else {
do {
puVar1=param_2;
param_4=in_r12>> 0x18;
param_2=puVar1 1;
in_r12=*param_2;
bVar17=3 param_3=param_3 - 4;
param_4=param_4 | in_r12 param_1=puVar2 1;
*puVar2=param_4;
puVar2=param_1;
} while (bVar17);
param_2=puVar1 7;
}
}
}
bVar13=in_r12;
bVar5=param_4;
bVar18=param_3>> 1 & 1;
param_3=param_3 bVar17=param_3 if (bVar18) {
pbVar3=param_2 1;
bVar5=*param_2;
param_2=param_2 2;
bVar13=*pbVar3;
}
puVar2=param_2;
if (bVar17) {
puVar2=param_2 1;
param_3=*param_2;
}
if (bVar18) {
pbVar3=param_1 1;
*param_1=bVar5;
param_1=param_1 2;
*pbVar3=bVar13;
}
puVar1=param_1;
if (bVar17) {
puVar1=param_1 1;
*param_1=param_3;
}
return CONCAT44(puVar2,puVar1);
}

These look, at first blush, pretty close to one another. In both cases what this function looks like is something like a compiler-optimized memcpy, implemented in such a way as to exploit whatever common alignment the source and destination pointer might have.

Now, as far as machine code goes, Ghidra’s decompiler is already quite good, and there’s no real need to put a rather opaque and heuristic LLM in its place. Where LLM-driven approximate decompilations can be quite useful is when dealing with a bytecode for which a good decompiler isn’t immediately available. Another researcher on the Tenable Zero Day team, Jimi Sebree, was able to coax ChatGPT into producing reasonably useful (if imperfect) decompilations of Lua bytecode while reversing a certain router’s LuCI front-end. This took us from something like this:

A screenshot of some Lua bytecode.
An (incomplete) snippet of disassembled Lua bytecode, decompiled by ChatGPT below.

To something like this:

module("luci.controller.admin.access_control", package.seeall)
local uci=require("luci.model.uci")
local controller=require("luci.model.controller")
local sys=require("luci.sys")
local form=require("luci.tools.form")
local debug=require("luci.tools.debug")
local client_mgmt=require("luci.model.client_mgmt")
local checktypes=require("luci.model.checktypes")
local cur=uci.cursor()
form=form.Form(cur, {"mac"})
local json=require("luci.json")
local http=require("luci.http")
local util=require("luci.util")
local ctl=controller.Controller()
local mgmt=client_mgmt.Client_mgmt()

function index()
entry({"admin", "access_control"}, call("action_index"), _("Access Control"), 3)
entry({"admin", "access_control", "details"}, call("action_details"), nil)
entry({"admin", "access_control", "details", "add"}, call("action_details_add"), nil)
entry({"admin", "access_control", "details", "add_from_pool"}, call("action_details_add_from_pool"), nil)
entry({"admin", "access_control", "details", "delete"}, call("action_details_delete"), nil)
end

function action_index()
local mac_list=mgmt:get_mac_list()
http.prepare_content("application/json")
http.write_json(mac_list)
end

function action_details()
local mac=http.formvalue("mac")
if not mac then
return
end
http.prepare_content("application/json")
http.write_json(mgmt:get_mac_details(mac))
end

As far as machine code binaries go, however, it seems that Ghidra already has decompilation well in hand. What the Ghidra framework could use, it seems, is a way of providing high-level explanations of what a particular decompiled function is doing. And this is just the sort of paraphrasing at which large language models excel.

GPT-3’s Not Bad at Commentary

So now here comes the interesting part. Let’s go back to that function we dumped from the Canon ImageClass printer at address 0x44b2d4b0, take the decompiled code that Ghidra gave us, and ask the LLM to provide it with a high-level gloss, explaining in “natural language” just what the code is doing.

>>> print(prompt_openai_model(f"""Here is some decompiled C code:```
{d}
```
Give me a high-level explanation of what this code is doing, and suggest a name for the function in question."""))
This code is performing a memory copy operation. It is copying data from one memory location to another. The code is looping through the source memory location, and copying each byte to the destination memory location. It is also taking into account any alignment issues that may arise. A suitable name for this function could be 'memory_copy'.

This is a pretty astute bird’s-eye view. The function is, indeed, a memory copying function, optimized in such a way as to exploit any alignment the memory pointers share.

To recap, we’ve seen how the workflow of a software reverse engineer involves (among other things) a series of tasks that can be loosely grouped together as operations of translation or paraphrase. These include, but are not limited to,

  1. data carving and the parsing of filesystems and files
  2. disassembly
  3. decompilation
  4. annotation and commentary

The modern-day reverse engineer is equipped with tools that can automate the first three tasks, albeit by no means in a foolproof fashion, and the reverser who relies entirely on their automated toolbox is no reverser at all. That the abstractions we deal in deceive us is something reverse engineers take for granted, after all, and this goes for the abstractions our tools employ no less than the abstractions our targets use.

What these quick and dirty experiments with an LLM suggest is that the fourth process listed here, the paraphrase of disassembled or decompiled code into high-level commentary, can be assisted by automated tooling as well.

And this is just what the G-3PO Ghidra script does.

The output of such a tool, of course, would still have to be carefully checked. Taking its soundness for granted would be a mistake, just as it would be a mistake to put too much faith in the decompiler. We should probably trust such a tool, backed as it is by an opaque LLM, far less than we trust decompilers, in fact. Fortunately, reverse engineering is the kind of domain where we don’t need to trust much at all. It’s an essentially skeptical craft. The reverser is well aware that every non-trivial abstraction leaks, and that complex hardware and software systems rarely behave as expected. The same healthy skepticism should always extend to our tools.

Developing the G-3PO Ghidra Script

Developing the G-3PO Ghidra script was surprisingly easy. The lion’s share of the work was just a matter of looking up various APIs and fiddling with a somewhat awkward development environment.

One of the weaknesses in Ghidra’s Python scripting support is that it’s restricted to the old and unmaintained “Jython” engine, a Python 2.7 interpreter that runs on the Java Virtual Machine. One option would have been to make use of the Ghidra to Python Bridge, a supplementary Ghidra script that lets you interact with Ghidra’s Jython interpreter from the Python 3 environment of your choice, over a local socket, but since my needs were pretty spare, I didn’t want to overburden the project with extra dependencies. All I really needed from the OpenAI Python module, after all, was a simple way to serialise, send, receive and parse HTTP requests that conform to the OpenAI API. Jython’s limitations being what they are, I ended up doing this with the curl binary, called through Jython’s built-in subprocess module, and parsing the response with the json module.

import json
import logging
import os
import subprocess as sp

def openai_request_cmd(prompt, temperature=0.19, max_tokens=MAXTOKENS, model=MODEL):
    """Build the curl command line for a completion request, or None if no API key is set."""
    openai_api_key = os.getenv("OPENAI_API_KEY")
    if openai_api_key is None:
        logging.error("OpenAI API key not found in environment variables!")
        return None
    data = {
        "model": model,
        # The original applies a re.sub() cleanup to the prompt at this point; the
        # pattern did not survive in this copy, so the prompt is passed unchanged.
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature
    }
    json_data = json.dumps(data)
    url = "https://api.openai.com/v1/completions"
    cmd = ["curl",
           url,
           "-H", "Content-Type: application/json",
           "-H", "Authorization: Bearer {openai_api_key}".format(openai_api_key=openai_api_key),
           "-d", json_data]
    return cmd

def openai_request(prompt, temperature=0.19, max_tokens=MAXTOKENS, model=MODEL):
    """Send the prompt to the completions endpoint via curl and return the parsed JSON."""
    cmd = openai_request_cmd(prompt, temperature=temperature, max_tokens=max_tokens, model=model)
    if cmd is None:
        return None
    cmdstr = " ".join(cmd)
    logging.info("Running command: {cmdstr}".format(cmdstr=cmdstr))
    res = sp.Popen(cmd, shell=False, stdout=sp.PIPE, stderr=sp.PIPE)
    # communicate() drains both pipes, so a large response cannot block the child.
    out, err = res.communicate()
    exitcode = res.returncode
    if exitcode != 0:
        logging.error("OpenAI request failed with exit code {exitcode}".format(exitcode=exitcode))
        logging.error("Error: {err}".format(err=err))
        return None
    logging.info("OpenAI request succeeded with exit code {exitcode}".format(exitcode=exitcode))
    logging.info("Response: {out}".format(out=out))
    try:
        return json.loads(out)
    except Exception as e:
        logging.error("Failed to parse JSON response: {e}".format(e=e))
        return None

This is good enough to avoid any dependency on the Python openai library.
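One detail worth keeping in mind with this approach: curl exits with status 0 even when the API answers with an HTTP error, so a zero exit code only tells us that a response arrived. A minimal sketch of pulling the completion text out of the parsed response might look like the following. The helper name `extract_completion_text` is hypothetical; the `choices[0]["text"]` and `error` fields follow the shape of the OpenAI completions response.

```python
def extract_completion_text(response):
    """Return the first completion's text, or None if the response has no usable choice."""
    if not isinstance(response, dict):
        return None  # the request failed earlier, or the JSON was not an object
    if "error" in response:
        return None  # the API reported a problem (bad key, quota, unknown model, ...)
    choices = response.get("choices") or []
    if not choices:
        return None
    text = choices[0].get("text")
    return text.strip() if text is not None else None


# Example with a hand-written response in the shape the completions endpoint returns:
sample = {"choices": [{"text": "\n\nThis code copies memory. ", "index": 0}]}
comment_text = extract_completion_text(sample)
```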

The prompt that G-3PO sends to the LLM is pretty basic, and there’s certainly room to tweak it a little in search of better results. What I’m currently using looks like this:

prompt="""
Below is some C code that Ghidra decompiled from a binary that I'm trying to
reverse engineer.

```
{c_code}
```
Please provide a detailed explanation of what this code does, in {style},
that might be useful to a reverse engineer. Explain your reasoning as much
as possible. {extra}

Finally, suggest suitable names for this function and its parameters.
""".format(c_code=c_code, style=STYLE, extra=EXTRA)

The c_code interpolated into the prompt is taken from the output of the Ghidra decompiler, for the function the user is currently inspecting. Quite usefully, this output includes any comments, variable names, or type annotations that the user has already added to the code listing, allowing the LLM to build on the user’s work. The exception is the plate comment positioned at the beginning of the function. This is where G-3PO will place its own comments, and if there’s already a comment in that slot it will refuse to clobber it, unless that comment begins with the phrase OpenAI GPT-3 generated comment.
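The overwrite rule described above can be sketched in a few lines. This is an illustration of the rule as stated, not the script's actual code: the function name is hypothetical, and tolerating leading whitespace is an added assumption.

```python
# Marker phrase quoted in the text above.
G3PO_MARKER = "OpenAI GPT-3 generated comment"


def may_overwrite_plate_comment(existing_comment):
    """Decide whether a function's plate comment slot can be safely (re)written."""
    # An empty slot is always free to use.
    if existing_comment is None or existing_comment.strip() == "":
        return True
    # Otherwise only replace comments that were generated earlier by the script.
    # Ignoring leading whitespace is an assumption made for this sketch.
    return existing_comment.lstrip().startswith(G3PO_MARKER)
```

This keeps the tool idempotent, since rerunning it refreshes its own output, without ever destroying notes the analyst wrote by hand.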

G-3PO in Action

Here’s how the script looks in action, commenting on the same function we looked at in the previous section, the infelicitously named FUN_44b2d4b0(), to which G-3PO suggests we should give the name CopyMemory():

A screenshot showing G-3PO providing commentary for a function in Ghidra.

Here’s what the comment says, so that you don’t have to squint:

OpenAI GPT-3 generated comment, take with a grain of salt:

This code is a function that copies data from one memory location to another. It takes four parameters: two pointers to memory locations (param_1 and param_2), and two integers (param_3 and param_4).

The function first checks if the value of param_3 is greater than 3. If it is, it checks if the first pointer (param_1) is aligned to a 4-byte boundary. If it is not, it copies the first byte from the second pointer (param_2) to the first pointer (param_1), and then checks if the second pointer (param_2) is aligned to a 4-byte boundary. If it is not, it copies the next byte from the second pointer (param_2) to the first pointer (param_1). It then subtracts the number of bytes copied from param_3 and adjusts the pointers accordingly.

The function then enters a loop, where it copies 8 bytes from the second pointer (param_2) to the first pointer (param_1) and adjusts the pointers accordingly. This loop is repeated until the value of param_3 is less than 0x20.

If the value of param_3 is less than 0x20, the function checks if the value of param_3 is greater than 3. If it is, it copies 4 bytes from the second pointer (param_2) to the first pointer (param_1) and adjusts the pointers accordingly. It then checks if the value of param_3 is greater than 1. If it is, it copies 2 bytes from the second pointer (param_2) to the first pointer (param_1) and adjusts the pointers accordingly. Finally, it checks if the value of param_3 is greater than 0. If it is, it copies 1 byte from the second pointer (param_2) to the first pointer (param_1) and adjusts the pointers accordingly.

The function then returns the adjusted pointers.

A suitable name for this function is “CopyMemory”. The variables bearing
default names can be renamed as follows:
— param_1 -> destPtr
— param_2 -> srcPtr
— param_3 -> numBytes
— param_4 -> alignment

Model: text-davinci-003, Temperature: 0.19

A close look at the code does indeed show something like an optimized memcpy function. It differs from a naïve memory copying loop, say

for (i=0; i<numBytes; i++) {
  destPtr[i]=srcPtr[i];
}

only in seeking to copy the largest aligned chunks in each iteration, up to a chunk size of 32 bytes. In this fashion, the processor can take advantage of the ARM ldmia instruction, which lets it read a block of memory into several registers in sequence, and the stmia instruction, which lets it write from a sequence of registers into a block of memory. Here’s what the assembly code looks like, for example, where the function is copying 32-byte (i.e., 8-word) blocks of memory at once:

                                 LAB_44b3632c                                            XREF[1]:       44b36340(j)  
44b3632c 80 f0 d1 f5 pld [r1,#0x80]
44b36330 f8 51 b1 e8 ldmia r1!,{r3,r4,r5,r6,r7,r8,r12,lr}=>LAB_454918fc
44b36334 20 20 42 e2 sub r2,r2,#0x20
44b36338 80 00 52 e3 cmp r2,#0x80
44b3633c f8 51 a0 e8 stmia r0!,{r3,r4,r5,r6,r7,r8,r12,lr}=>DAT_e1a04005
44b36340 f9 ff ff 2a bcs LAB_44b3632c
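To make the strategy concrete, here is a small Python model of the word-aligned path: 32-byte blocks in the main loop, then a tail in which each set bit of the remaining count selects one power-of-two chunk, much as the `lsls` and conditional load/store pairs in the disassembly test bits 4 through 0 of the counter. It is a behavioural sketch only. The name `chunked_copy` is hypothetical, and the alignment fix-ups and the shifted-word loops for misaligned sources are deliberately left out.

```python
def chunked_copy(dst, src, n):
    """Copy n bytes from src to dst in 32-byte blocks, then a bit-tested tail."""
    i = 0
    # Main loop: one 32-byte block per iteration (the ldmia/stmia pair moves 8 words).
    while n - i >= 32:
        dst[i:i + 32] = src[i:i + 32]
        i += 32
    remaining = n - i  # now in the range 0..31
    # Tail: each set bit of the remaining count selects one power-of-two chunk.
    for size in (16, 8, 4, 2, 1):
        if remaining & size:
            dst[i:i + size] = src[i:i + size]
            i += size
    return i  # number of bytes copied


source = bytes(range(77))
destination = bytearray(77)
copied = chunked_copy(destination, source, 77)
```

With 77 bytes, the loop moves two 32-byte blocks and the tail moves 8 + 4 + 1 bytes, so no byte is ever copied singly unless the count is odd.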

A Backdoor Lockpick, Revisited

Earlier this year, I posted a detailed analysis of a vulnerable backdoor that I found on every Phicomm router I could get my hands on. At the heart of that analysis was a fine-grained reversing of the telnetd_startup binary that I found in these routers’ firmware, beginning with the version found on the Phicomm K2GA router, firmware version 22.6.3.20. Let’s see what G-3PO has to say about a few of the important functions in that binary, unaided by my own comments, corrections, and annotations to the decompiler output (starting from scratch, that is, with the output Ghidra provides).

A screenshot of Ghidra’s decompilation of a function from the Phicomm backdoor binary.
The unannotated Ghidra decompilation for a function in the Phicomm backdoor binary, telnetd_startup, which checks to see if either a reset request or the initial handshake has been received.

Here’s a function that the backdoor server calls to check whether the client has provided the magic handshake, ABCDEF1234, when the state machine is in its initial state, or whether the client has requested to restart the state machine.

This is what G-3PO had to say about this fairly simple function:

G-3PO glossing a function in the Phicomm backdoor binary, telnetd_startup.
G-3PO’s commentary on one of the functions called in the Phicomm backdoor binary, telnetd_startup.

This is essentially correct, though we should still note that what’s important about this function, for the reverse engineer, has more to do with the role it plays in the state machine implementing the backdoor protocol as a whole.

In certain cases, we see that G-3PO makes an incorrect but reasonable guess, such as when it mistakes the function that generates the ephemeral keys for temporarily and permanently unlocking the backdoor for a function that sets up file system paths. The difference, however, is entirely a matter of context, and we can steer our droid (which can’t, after all, see anything outside this single block of code) in the right direction by naming a few of the other functions that this function calls.


/*
/--------------------------------------------------------------------------------\
|OpenAI GPT-3 generated comment, take with a grain of salt:                      |
|                                                                                |
|    This code appears to be setting up two different paths, one for permanent   |
|storage and one for temporary storage. The function is first setting up a buffer|
|of 88 bytes (auStack_60) and then using sprintf to create two strings,          |
|acStack_460 and acStack_260, which are the paths for the permanent and temporary|
|storage, respectively. The function then calls FUN_00401f30 with the buffer as  |
|an argument, followed by a call to FUN_00402b28 with the buffer, the permanent  |
|path string, and the length of the permanent path string as arguments. This is  |
|followed by a call to FUN_00402c28 with the buffer and a constant string as     |
|arguments. The function then repeats the same steps for the temporary path      |
|string.                                                                         |
|    This function could be named setupPaths(). The variable auStack_60 could be |
|named buffer, acStack_460 could be named permPath, and acStack_260 could be     |
|named tempPath.                                                                 |
|                                                                                |
|Model: text-davinci-003, Temperature: 0.19                                      |
\--------------------------------------------------------------------------------/
              /
           /~\
          |oo )
          _\=/_
         /     \
        //|/.\|\\
       ||  \_/  ||
       || |\ /| ||
        # \_ _/ #
          | | |
          | | |
          []|[]
          | | |
         /_]_[_\
*/

/* DISPLAY WARNING: Type casts are NOT being printed */

undefined4 FUN_0040182c(void)

{
  size_t sVar1;
  char acStack_460 [512];
  char acStack_260 [512];
  undefined auStack_60 [88];

  memset(auStack_60,0,0x58);
  sprintf(acStack_460,"%s PERM",&DAT_00414b80);
  sprintf(acStack_260,"%s TEMP",&DAT_00414b80);
  FUN_00401f30(auStack_60);
  sVar1=strlen(acStack_460);
  FUN_00402b28(auStack_60,acStack_460,sVar1);
  FUN_00402c28(auStack_60,&DAT_00414c20);
  FUN_00401f30(auStack_60);
  sVar1=strlen(acStack_260);
  FUN_00402b28(auStack_60,acStack_260,sVar1);
  FUN_00402c28(auStack_60,&DAT_00414c30);
  return 0;
}

/*
/--------------------------------------------------------------------------------\
|OpenAI GPT-3 generated comment, take with a grain of salt:                      |
|                                                                                |
|    This code is responsible for generating two MD5 hashes from two strings. The|
|first string is a concatenation of the XORED_MSG_0014b80 variable and the string|
|" PERM". The second string is a concatenation of the XORED_MSG_0014b80 variable |
|and the string " TEMP".                                                         |
|    The function begins by setting an array of 88 bytes to 0. This array is used|
|to store the MD5 context. It then uses sprintf to create two strings,           |
|acStack_460 and acStack_260, which are the two strings mentioned above.         |
|    The function then calls md5_init with the MD5 context array as an argument. |
|This initializes the MD5 context. It then calls strlen to get the length of the |
|first string, acStack_460, and passes it to md5_add along with the MD5 context  |
|array and the first string. This adds the first string to the MD5 context.      |
|    The function then calls md5_digest with the MD5 context array and the       |
|address of the DAT_00414c20 variable as arguments. This generates an MD5 hash   |
|from the first string and stores it in the DAT_00414c20 variable.               |
|    The function then calls md5_init again with the MD5 context array as an     |
|argument. It then calls strlen to get the length of the second string,          |
|acStack_260, and passes it to md5_add along with the MD5 context array and the  |
|second string. This adds the second string to the MD5 context.                  |
|    The function then calls md5_digest with the MD5 context array and the       |
|address of the DAT_00414c30 variable as arguments. This generates an MD5 hash   |
|from the second string and stores it in the DAT_00414c30 variable.              |
|    This function could be named generateMD5HashesFromStrings. The variables    |
|could be renamed as follows:                                                    |
|  - acStack_460 -> permString                                                   |
|  - acStack_260 -> tempString                                                   |
|  - auStack_60 -> md5Context                                                    |
|  - DAT_00414c20 -> permHash                                                    |
|  - DAT_00414c30 -> tempHash                                                    |
|                                                                                |
|Model: text-davinci-003, Temperature: 0.19                                      |
\--------------------------------------------------------------------------------/
              /
           /~\
          |oo )
          _\=/_
         /     \
        //|/.\|\\
       ||  \_/  ||
       || |\ /| ||
        # \_ _/ #
          | | |
          | | |
          []|[]
          | | |
         /_]_[_\
*/

/* DISPLAY WARNING: Type casts are NOT being printed */

undefined4 FUN_0040182c(void)

{
  size_t sVar1;
  char acStack_460 [512];
  char acStack_260 [512];
  undefined auStack_60 [88];

  memset(auStack_60,0,0x58);
  sprintf(acStack_460,"%s PERM",&XORED_MSG_0014b80);
  sprintf(acStack_260,"%s TEMP",&XORED_MSG_0014b80);
  md5_init(auStack_60);
  sVar1=strlen(acStack_460);
  md5_add(auStack_60,acStack_460,sVar1);
  md5_digest(auStack_60,&DAT_00414c20);
  md5_init(auStack_60);
  sVar1=strlen(acStack_260);
  md5_add(auStack_60,acStack_260,sVar1);
  md5_digest(auStack_60,&DAT_00414c30);
  return 0;
}
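If the renamed helpers really are a standard MD5 implementation, the whole function reduces to a couple of lines of Python. The sketch below rests on that assumption; the function name `derive_unlock_hashes` is hypothetical, and the message passed in the example is a placeholder, since the contents of XORED_MSG_0014b80 are not shown here.

```python
import hashlib


def derive_unlock_hashes(message):
    """Mirror the decompiled logic: MD5 of '<message> PERM' and of '<message> TEMP'."""
    # sprintf("%s PERM", msg) followed by md5_init / md5_add / md5_digest
    perm_hash = hashlib.md5(message + b" PERM").digest()
    # The same sequence again with the " TEMP" suffix
    temp_hash = hashlib.md5(message + b" TEMP").digest()
    return perm_hash, temp_hash


# Placeholder input; the real message lives at XORED_MSG_0014b80 in the binary.
perm, temp = derive_unlock_hashes(b"example-message")
```

A re-implementation like this is a cheap way to test the droid's reading: if its output matches the bytes the firmware writes to DAT_00414c20 and DAT_00414c30, the MD5 hypothesis holds.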

But how do we know that these auxiliary functions are setting up an MD5 hash? Well, we could figure it out the old fashioned way, as I did when I wrote the report on the backdoor earlier this year, by recognizing the constants used in initializing the MD5 struct, or we could just let our droid have a go at it. The droid, it turns out, does a pretty good job:


/*
/--------------------------------------------------------------------------------\
|OpenAI GPT-3 generated comment, take with a grain of salt:                      |
|                                                                                |
|    This code is initializing a data structure used for the MD5 hashing         |
|algorithm. The data structure is an array of 5 32-bit unsigned integers, which  |
|are stored in the memory pointed to by the parameter 'param_1'. The first       |
|element of the array is set to 0, the second element is set to 0x67452301, the  |
|third element is set to 0, the fourth element is set to 0xefcdab89, the fifth   |
|element is set to 0x98badcfe, and the sixth element is set to 0x10325476.       |
|    A suitable name for this function could be 'init_md5_state'. The parameter  |
|'param_1' could be renamed to 'md5_state', since it is a pointer to the array of|
|integers used to store the state of the MD5 hashing algorithm.                  |
|                                                                                |
|Model: text-davinci-003, Temperature: 0.19                                      |
\--------------------------------------------------------------------------------/
              /
           /~\
          |oo )
          _\=/_
         /     \
        //|/.\|\\
       ||  \_/  ||
       || |\ /| ||
        # \_ _/ #
          | | |
          | | |
          []|[]
          | | |
         /_]_[_\
*/

/* DISPLAY WARNING: Type casts are NOT being printed */

void FUN_00401f30(undefined4 *param_1)

{
  *param_1=0;
  param_1[2]=0x67452301;
  param_1[1]=0;
  param_1[3]=0xefcdab89;
  param_1[4]=0x98badcfe;
  param_1[5]=0x10325476;
  return;
}
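The "old fashioned way" of recognizing the initializer by its constants is easy to mechanize. The sketch below, with the hypothetical helper `find_md5_init`, scans a sequence of 32-bit words for the four MD5 initial-state values. A match is a strong hint rather than proof: MD4 starts from the same four words, and SHA-1 uses them as the first four of its five.

```python
# Initial chaining values A, B, C, D from the MD5 specification.
MD5_INIT_WORDS = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476)


def find_md5_init(words):
    """Return the index where the four MD5 initial-state words start, or -1."""
    words = tuple(words)
    for i in range(len(words) - len(MD5_INIT_WORDS) + 1):
        if words[i:i + len(MD5_INIT_WORDS)] == MD5_INIT_WORDS:
            return i
    return -1


# The values FUN_00401f30 stores into param_1[0..5], in index order:
stored = (0, 0, 0x67452301, 0xefcdab89, 0x98badcfe, 0x10325476)
offset = find_md5_init(stored)
```

Here the constants start at index 2, after two zeroed words, which would be consistent with a context structure that keeps a 64-bit length counter ahead of the state; that layout is an inference, not something the decompilation states.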

The droid provides a reasonable description of the main server loop in the backdoor binary, too:

Screenshot of G-3PO providing a comment on a decompiled function in Ghidra.
G-3PO glossing the main server loop in the Phicomm backdoor binary, telnetd_startup.

Installing and Using G-3PO

So, G-3PO is now ready for use. The only catch is that it does require an OpenAI API key, and the text completion service is unfree (as in beer, and insofar as the model’s a black box). It is, however, reasonably cheap, and even with heavy use I haven’t spent more than the price of a cup of coffee while developing, debugging, and toying around with this tool.

To run the script:

  • get yourself an OpenAI API key
  • add the key as an environment variable by putting export OPENAI_API_KEY=whateveryourkeyhappenstobe in your ~/.profile file, or any other file that will be sourced before you launch Ghidra
  • copy or symlink g3po.py to your Ghidra scripts directory
  • add that directory in the Script Manager window
  • visit the decompiler window for a function you’d like some assistance interpreting
  • and then either run the script from the Script Manager window by selecting it and hitting the ▶️ icon, or bind it to a hotkey and strike when needed

Ideally, I’d like to provide a way for the user to twiddle the various parameters used to solicit a response from the model, such as the “temperature” in the request (high temperatures, approaching 2.0, solicit a more adventurous response, while low temperatures instruct the model to respond conservatively), all from within Ghidra. There’s bound to be a way to do this, but it seems neither the Ghidra API documentation, Google, nor even ChatGPT are offering me much help in that regard, so for now you can adjust the settings by editing the global variables declared near the beginning of the g3po.py source file:

##########################################################################################
# Script Configuration
##########################################################################################
MODEL="text-davinci-003" # Choose which large language model we query
TEMPERATURE=0.19 # Set higher for more adventurous comments, lower for more conservative
TIMEOUT=600 # How many seconds should we wait for a response from OpenAI?
MAXTOKENS=512 # The maximum number of tokens to request from OpenAI
C3POSAY=True # True if you want the cute C-3PO ASCII art, False otherwise
LANGUAGE="English" # This can also be used as a style parameter.
EXTRA="" # Extra text appended to the prompt.
LOGLEVEL=INFO # Adjust for more or less line noise in the console.
COMMENTWIDTH=80 # How wide the comment, inside the little speech balloon, should be.
C3POASCII=r"""
           /~\
          |oo )
          _\=/_
         /     \
        //|/.\|\\
       ||  \_/  ||
       || |\ /| ||
        # \_ _/ #
          | | |
          | | |
          []|[]
          | | |
         /_]_[_\
"""
##########################################################################################
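As an illustration of what COMMENTWIDTH controls, here is one way a comment could be wrapped and framed in a speech balloon like the ones shown earlier. It is a sketch built on the standard `textwrap` module, not the script's own implementation, and the function name `speech_balloon` is hypothetical.

```python
import textwrap


def speech_balloon(text, width=80):
    """Wrap text to `width` columns and frame it in an ASCII speech balloon."""
    lines = []
    for paragraph in text.splitlines():
        # textwrap.wrap() returns [] for an empty line; keep it as a blank row.
        lines.extend(textwrap.wrap(paragraph, width) or [""])
    top = "/" + "-" * width + "\\"
    bottom = "\\" + "-" * width + "/"
    body = ["|" + line.ljust(width) + "|" for line in lines]
    return "\n".join([top] + body + [bottom])


balloon = speech_balloon("This code copies memory.\n\nModel: text-davinci-003", width=30)
```

Padding every row to the same width keeps the right-hand border straight no matter where the wrapper breaks the lines.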

The LANGUAGE and EXTRA parameters provide the user with an easy way to play with the form of the LLM’s commentary. Setting style to "in the form of a sonnet", for example, gives us results like this:

A screenshot of G-3PO glossing a function in sonnet form.
G-3PO glossing the main loop function in the Phicomm backdoor binary, telnetd_startup, in the form of a sonnet.
A screenshot of G-3PO glossing a function in sonnet form.
G-3PO glossing the optimized memory copy function in the Canon printer firmware, in the form of a sonnet.

These are by no means good sonnets, but you can’t have everything.

G-3PO is open source and released under an MIT license. You can find the script in Tenable’s public GitHub repository HERE.

Happy holidays and happy hacking!
