An interview transcript is only useful if you can trust it enough for the next step.
That next step might be writing an article, preparing research notes, pulling customer quotes, editing a podcast, creating subtitles, sharing a client summary, or reviewing what someone actually said before you make a decision.
That is why proper interview transcription is not just “convert audio to text.” A raw transcript is a first pass. A proper transcript is a working document that stays connected to the original recording, so you can check the parts that matter.
If you are trying to learn how to properly transcribe an interview on Mac for free, the best workflow is simple:
- Record with permission where needed.
- Preserve the original audio or video file.
- Transcribe the saved recording.
- Review important sections against playback.
- Clean up names, speakers, numbers, and terminology.
- Add notes and highlights for the material you will use.
- Summarize only after review.
- Export the transcript in the right format.
Jotr is built around that kind of interview transcription workflow on Mac. It is a Mac desktop app and local transcription review workspace that turns existing audio and video files into local transcripts. Its core flow is: import file, transcribe, review with timestamp-linked playback, edit, highlight, note, summarize, and export the reviewed result.
You can start free transcription in Jotr with no account or credit card required.
Start with a recording you are allowed to use
Before any transcription step, make sure you have permission to record where needed. This is not a legal guide, and rules vary by place and situation, but it is always better to handle recording consent before the interview begins.
A good interview recording also makes transcription easier. Use a quiet room when possible, reduce background noise, and keep speakers close enough to the microphone. If the interview is remote, save the final audio or video file somewhere you can find it later.
The key point: do not rely on memory or scattered clips. Keep the source recording.
Preserve the original interview file
The source file is your reference copy. Keep it unchanged.
If the interview becomes important later, you may need to return to the exact moment where a quote, name, number, or claim was spoken. That is difficult if the recording has been renamed randomly, moved across folders, or replaced by edited versions.
A simple naming pattern helps:
2026-05-28-customer-interview-alex-rivera.m4a
or:
podcast-interview-episode-12-guest-name.mov
The exact format does not matter as much as consistency. What matters is that the transcript and recording stay easy to match.
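If you name many recordings, a small script can keep the pattern consistent. The following is a minimal sketch that assumes a date-first, lowercase, hyphen-separated scheme like the first example above; the function name `interview_filename` and its parameters are illustrative, not part of any tool.

```python
import re
from datetime import date


def interview_filename(day: date, kind: str, name: str, ext: str = "m4a") -> str:
    """Build a name like 2026-05-28-customer-interview-alex-rivera.m4a."""

    def slug(text: str) -> str:
        # Lowercase, then collapse every run of non-alphanumeric characters
        # into a single hyphen and trim hyphens from both ends.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    # Accept the extension with or without a leading dot, in any case.
    clean_ext = ext.lower().lstrip(".")
    return f"{day.isoformat()}-{slug(kind)}-{slug(name)}.{clean_ext}"


print(interview_filename(date(2026, 5, 28), "Customer Interview", "Alex Rivera"))
```

Putting the ISO date first means an alphabetical sort in Finder is also a chronological sort, which helps keep a recording and its transcript easy to match.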
Jotr works from existing audio and video files. Supported audio inputs include mp3, m4a, wav, aac, aiff, caf, and flac. Supported video inputs include mp4, mov, mkv, and avi.
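If you want to sort a folder of recordings before importing, the extension lists above can be turned into a quick check. This is only a sketch based on the formats named in this guide; the helper `media_kind` is hypothetical and does not reflect how Jotr itself detects files.

```python
from pathlib import Path
from typing import Optional

# Extensions taken from the supported-input lists in this guide.
AUDIO_EXTENSIONS = {"mp3", "m4a", "wav", "aac", "aiff", "caf", "flac"}
VIDEO_EXTENSIONS = {"mp4", "mov", "mkv", "avi"}


def media_kind(path: str) -> Optional[str]:
    """Return "audio", "video", or None based only on the file extension."""
    ext = Path(path).suffix.lower().lstrip(".")
    if ext in AUDIO_EXTENSIONS:
        return "audio"
    if ext in VIDEO_EXTENSIONS:
        return "video"
    return None


print(media_kind("2026-05-28-customer-interview-alex-rivera.m4a"))
```

The check looks at the file name only, so a mislabeled file would still pass; treat it as a convenience filter, not a validation step.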
Transcribe the saved audio or video file
Once you have the file, import it into your transcription workspace and generate the first transcript.
This is where many people stop too early. AI interview transcription can save a lot of time, but the first pass still needs human review. Interviews often include interruptions, overlapping speech, names, abbreviations, filler words, industry terms, product names, places, and numbers. Those are exactly the details that tend to matter when you quote or summarize someone.
Think of the raw transcript as a map of the interview. It gives you structure, searchability, and a place to begin. It is not yet the final source of truth.
In Jotr, you can import an existing file, transcribe it locally on your Mac, and then review the transcript with playback tied to timestamps. If you want a deeper look at the review layer after transcription, see the AI transcript editor for Mac guide.
Keep the transcript connected to playback
A proper interview transcript should help you move between text and audio quickly.
Timestamps are what make that possible. They let you search or skim the transcript, click back into the recording, and hear the exact moment again. This matters when you are checking quotes, confirming speaker intent, or deciding whether a sentence should be cleaned up.
For example, a raw transcript might say:
“The launch was in March, or maybe May, and we had around 15 teams using it.”
Before quoting that, you would want to replay the moment. Maybe the speaker actually said “mid-March.” Maybe they said “50 teams,” not “15.” Maybe the uncertainty matters and should stay in the transcript.
Without timestamps, checking that moment becomes slow. With timestamp-linked playback, review becomes part of the workflow instead of a separate hunt through the recording.
Review the sections that matter most
You do not always need to polish every sentence equally.
For many interviews, the most important review work is concentrated around:
- direct quotes
- names and titles
- company, product, or place names
- dates, times, and numbers
- technical terms
- claims you may publish or share
- moments that change the meaning of the conversation
If you are doing journalist interview transcription, this is where quote checking matters. If you are doing research interview transcription, this is where you clean up participant context and make sure observations are tied to what was actually said. If you are a podcaster or creator, this is where you find strong clips, captions, and episode notes.
A readable transcript does not need to remove every “um” or false start. But it should not distort what the speaker meant.
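If you work from an exported plain-text transcript, a short script can help you find lines worth replaying. The sketch below is a rough heuristic that flags lines containing digits or capitalized English month names; the names `NEEDS_REVIEW` and `lines_to_recheck` are illustrative. It assumes the lines have no timestamp prefix, since a timestamp would make every line match.

```python
import re

MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)

# Match any digit, or a capitalized month name as a whole word.
NEEDS_REVIEW = re.compile(rf"\d|\b(?:{MONTHS})\b")


def lines_to_recheck(lines):
    """Return the lines that mention a number or a month."""
    return [line for line in lines if NEEDS_REVIEW.search(line)]


transcript = [
    "The launch was in March, or maybe May, and we had around 15 teams using it.",
    "We realized the onboarding was too long.",
]
for line in lines_to_recheck(transcript):
    print(line)
```

A heuristic like this will over-flag (for example, a sentence that starts with "May I ask") and will miss spelled-out numbers, so it narrows where to listen and does not replace listening.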
Clean speaker context and names
A transcript becomes much easier to use when the reader can tell who is speaking.
For a simple two-person interview, speaker labels can be plain:
Interviewer: What changed after the first month?
Guest: We realized the onboarding was too long.
For research, consulting, or customer interviews, you may want more useful labels:
Researcher:
Participant:
or:
Consultant:
Client:
If names are sensitive, use neutral labels. If names are important, check spelling against your notes or the recording. Do not assume the raw transcript got them right.
This is also where you can fix recurring speaker mix-ups. Interview audio can be messy. People interrupt each other, laugh, pause, or talk over a video delay. Speaker context should help the transcript make sense without pretending the conversation was more polished than it was.
Use notes and highlights to turn the transcript into working material
The point of an interview transcript is usually not the transcript itself. It is what you need to do with it.
Notes and highlights help you move from “I have the text” to “I know what matters.”
You might highlight:
- a strong quote
- a customer pain point
- a key objection
- a story worth using in an article
- a timestamp for a video clip
- a section that needs follow-up
- a theme that appears across several interviews
You might add notes such as:
- “Check exact product name before publishing”
- “Possible pull quote”
- “Ask follow-up in next interview”
- “Good section for podcast intro”
- “Compare with participant 04”
This is where proper transcription starts to overlap with real work. The transcript becomes searchable, reviewable material rather than a wall of text. If the main output you need is notes, the related guide on how to turn audio recordings into notes on Mac goes deeper on that workflow.
Jotr supports this review flow with editing, highlighting, notes, timestamp-linked playback, summaries, and exports from the reviewed transcript.
Keep private interview projects on your Mac
Interviews can contain unreleased stories, customer feedback, research material, client context, or private names. That does not turn a transcription app into a legal, compliance, or source-protection system, but it does make project handling matter.
Jotr projects are created, stored, and processed on the Mac. Jotr has no account system, no cloud workspace, and no app backend for user work. For a broader explanation of that approach, see the private Mac transcription workflow.
A simple interview transcript template
Here is a practical interview transcript example you can adapt:
Interview Transcript
Interview: Customer research interview
Date: May 28, 2026
Recording: Original audio file saved separately
Participants: Researcher, Participant
Review status: Important quotes and names checked against playback
Transcript
[00:00:12] Researcher: Thanks for joining. To start, can you tell me what you were trying to do before you found the product?
[00:00:21] Participant: We were trying to collect feedback from customers after onboarding, but most of it was scattered across calls and notes.
Note: Good summary of the original problem.
[00:01:08] Researcher: What made that difficult?
[00:01:12] Participant: The hard part was not collecting feedback. It was finding the same issue repeated across different conversations.
Highlight: Strong quote for research summary.
This kind of interview transcript template is not complicated. It keeps the basics visible: timestamp, speaker, transcript text, and review notes.
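Because each spoken line in the template follows the same shape, it is also easy to process with a script. The sketch below assumes the exact `[HH:MM:SS] Speaker: text` layout shown above; `Segment` and `parse_line` are hypothetical names, and lines such as "Note:" or "Highlight:" are skipped because they carry no timestamp.

```python
import re
from typing import NamedTuple, Optional


class Segment(NamedTuple):
    seconds: int  # offset from the start of the recording
    speaker: str
    text: str


# [HH:MM:SS], then a speaker label up to the first colon, then the spoken text.
LINE = re.compile(r"^\[(\d{2}):(\d{2}):(\d{2})\]\s+([^:]+):\s+(.*)$")


def parse_line(line: str) -> Optional[Segment]:
    """Parse one template line, or return None if it is not a spoken line."""
    match = LINE.match(line.strip())
    if match is None:
        return None
    hours, minutes, secs = (int(match.group(i)) for i in (1, 2, 3))
    return Segment(hours * 3600 + minutes * 60 + secs,
                   match.group(4).strip(),
                   match.group(5))


print(parse_line("[00:01:08] Researcher: What made that difficult?"))
```

Converting the timestamp to seconds makes it simple to jump to the same moment in a media player or to sort segments collected from several interviews.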
Summarize after review, not before
Summaries are useful, but they should come after you have reviewed the transcript enough to trust the important parts.
A summary made from an unreviewed transcript can carry forward small errors. If a name, number, or quote is wrong in the transcript, the summary may repeat or smooth over that mistake. That can be especially risky when you are preparing interview notes for a client, research team, article, or public episode description.
Jotr’s Summary Beta is based on the reviewed transcript. It can help create a first-pass overview or notes draft, but it should not be treated as final verified interview notes. Use it to speed up orientation, then check important claims and quotes against the transcript and playback.
A good summary might include:
- main topics discussed
- key quotes to consider
- open questions
- decisions or next steps
- themes worth comparing across interviews
The reviewed transcript remains the source you return to.
Export based on what you need next
The right export format depends on what you plan to do with the transcript.
Use plain text when you want a simple copyable transcript for notes, search, or pasting into another tool.
Use Markdown when you are writing, publishing, or organizing notes in a Markdown-based workflow.
Use Word/DOCX when you need to hand the transcript to a client, editor, professor, teammate, or stakeholder who expects a document. The focused guide on how to export a transcript to Word on Mac covers that path.
Use SRT or VTT when the interview will become captions or subtitles.
Use timestamped text, timestamped Markdown, or timestamped Word/DOCX when you want the transcript to remain easy to verify against the recording.
Jotr supports raw transcript exports as Plain Text, SRT, and VTT. Reviewed transcript exports include Plain Text, timestamped text, SRT, VTT, Markdown, timestamped Markdown, Word/DOCX, and timestamped Word/DOCX. Summary exports are available as TXT, Markdown, and DOCX. For the broader review, notes, and export layer, see AI transcript review, notes, and export.
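To see why SRT suits captions, it helps to look at how simple the format is: a running cue number, a start and end time, the caption text, and a blank line between cues. The sketch below shows that structure and is not Jotr's exporter; `srt_timestamp` and `to_srt` are illustrative names, and the cue timings are taken from the template earlier in this guide.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time offset as HH:MM:SS,mmm (SRT uses a comma before milliseconds)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues) -> str:
    """Turn (start_seconds, end_seconds, text) tuples into SRT text."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}"
        )
    # Cues are separated by a blank line; end the file with a newline.
    return "\n\n".join(blocks) + "\n"


print(to_srt([(12.0, 21.0, "Thanks for joining."), (68.0, 72.0, "What made that difficult?")]))
```

WebVTT is closely related: it starts with a `WEBVTT` header line and uses a period instead of a comma before the milliseconds.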
What “proper” really means
Proper interview transcription does not mean making the transcript look perfect. It means making it useful and reliable for the job it has to do.
For a journalist, that may mean carefully checked quotes and clean speaker labels.
For a researcher, it may mean searchable participant responses, notes, highlights, and timestamps for later analysis.
For a creator or podcaster, it may mean finding clips, writing show notes, preparing captions, and checking the wording of important moments.
For a consultant or product team, it may mean turning a customer conversation into accurate notes that can be shared without losing the voice of the interviewee.
In all of these cases, the workflow matters: record with permission, preserve the source, transcribe the saved file, review against playback, clean the details, add working notes, summarize carefully, and export for the next step.
That is how to properly transcribe an interview. For related saved-recording guides, the professional recording workflows hub collects podcast, meeting, lecture, research, client call, and interview workflows.