Changelog

Name: CyberWhisper
Author: CyberWhisper

Version 0.4.0

Bug Fixes, UI Improvements & Infrastructure Migration

Full-Text Search Issues:

Fixed issues in the full-text search feature so that search results are accurate and reliable.

Extended Display Issues:

Fixed display and positioning issues when using extended display configurations. Improved window behavior and placement in multi-display environments.

Error Message Display:

Fixed issues with how error messages are displayed and formatted. Improved error message clarity and user feedback mechanisms.

UI Style Alignment:

Updated the application styling to match the website design for a consistent brand experience. Improved visual consistency across all application interfaces.

Dictionary & Snippet Search:

Implemented case-insensitive search for vocabulary and snippets to improve search usability. Users can now find entries regardless of letter case.
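
A minimal sketch of case-insensitive matching, assuming entries are plain strings and that a substring match is sufficient. The function name `search_entries` and the sample vocabulary are hypothetical and not taken from the application.

```python
def search_entries(entries: list[str], query: str) -> list[str]:
    """Return entries that contain the query, ignoring letter case."""
    # casefold() is a more aggressive lower() intended for caseless comparison.
    needle = query.casefold()
    return [entry for entry in entries if needle in entry.casefold()]


vocabulary = ["Kubernetes", "PostgreSQL", "gRPC"]
matches = search_entries(vocabulary, "POSTGRES")
```

Folding both sides at comparison time leaves the stored entries untouched, so the original casing is still available for display.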

Infrastructure Migration:

Migrated to new infrastructure to improve performance and stability.

Version 0.3.1

System Tray Icons, Onboarding & Dictionary Sync

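System Tray Mode Icons:

Added preset icons (emoji) to each mode in the system tray "Select Mode" menu for quick visual identification. Implemented comprehensive icon mapping for all preset types (voice_to_text, message, note, mail, vibe_coding, meeting, custom). The icons match the preset icons used in the mode list, providing a consistent user experience.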
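
One way such a mapping could look is sketched below. The preset keys come from the list above; the specific emoji, the `PRESET_ICONS` name, and the `tray_label` helper are illustrative assumptions, since the release notes do not say which icons are used.

```python
# Hypothetical emoji choices; the release notes do not list the actual icons.
PRESET_ICONS = {
    "voice_to_text": "🎙️",
    "message": "💬",
    "note": "📝",
    "mail": "✉️",
    "vibe_coding": "💻",
    "meeting": "👥",
    "custom": "⚙️",
}


def tray_label(preset: str, mode_name: str) -> str:
    """Prefix a tray menu entry with its preset icon.

    Unknown preset types fall back to the icon for "custom".
    """
    icon = PRESET_ICONS.get(preset, PRESET_ICONS["custom"])
    return f"{icon} {mode_name}"
```

Keeping the mapping in a single table makes it straightforward for the tray menu and the mode list to share the same icons.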

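Enhanced Onboarding Experience:

Added a dedicated step to the onboarding flow for creating the first mode. Integrated keyboard shortcut setup directly into the onboarding process. The flow now provides step-by-step guidance for essential configuration during the first-run experience.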

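Dictionary Snippet Server Sync:

Implemented server-side synchronization for dictionary snippets. Snippets are now automatically synced to the server for backup and cross-device access. Sync runs in the background without interrupting the user's workflow.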

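Advanced Settings Expansion:

Added more granular advanced settings for power users. Expanded the settings interface with additional customization options. Improved settings categorization and organization for easier navigation.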

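Mode List User Experience:

Improved visual feedback with better visual indicators and status feedback. Streamlined the mode editing interface with more intuitive controls and better error handling. Fixed the visibility of the CyberWhisper provider icon in dark mode by using theme-appropriate icon files. Ensured all provider icons display correctly in both light and dark themes.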

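Mode Editing Workflow:

Improved the mode editing workflow with better validation and user feedback. Improved the integration of preset icons across the mode management interface. Strengthened error handling and user messaging during mode configuration.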

Version 0.3.0

Message & Note Presets, BYOK LLM & Settings UI Refactoring

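Message & Note Presets:

Added a Message preset that converts transcriptions into concise chat messages, with tone customization options. Implemented a Note preset that summarizes transcriptions into structured notes with key-point extraction.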

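Bring Your Own Key (BYOK) for Frontier LLMs:

Implemented support for bringing your own API key to use frontier LLM models. Added support for OpenAI-compatible API endpoints from various providers. Added a comprehensive provider management interface for adding, configuring, and managing custom LLM providers. Added automatic API key validation with detailed error messages for connection issues. Added flexible model selection and configuration for each custom provider. Supports multiple LLM providers, including OpenAI, DeepSeek, Groq, Together AI, Perplexity, and Longcat.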
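
As an illustration of how a key might be checked against an OpenAI-compatible endpoint, the sketch below builds (but does not send) a request to the provider's model list, which the OpenAI API exposes at `/models` under the versioned base URL and protects with a bearer token. Whether CyberWhisper validates keys this way is an assumption; `build_key_check_request`, the example base URL, and the key are placeholders.

```python
import urllib.request


def build_key_check_request(base_url: str, api_key: str) -> urllib.request.Request:
    """Build a GET request for the provider's model list without sending it."""
    url = base_url.rstrip("/") + "/models"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


# Placeholder provider URL and key, for illustration only.
request = build_key_check_request("https://api.example.com/v1/", "sk-test")
```

Sending the request with `urllib.request.urlopen` and inspecting the HTTP status would be one way to distinguish a rejected key from a connection problem.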

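Settings UI Refactoring:

Refactored the settings interface into a modal window to improve the user experience. Settings can now be accessed and configured without leaving the main interface. Added ESC key support for quickly closing the modal. Enhanced the settings layout with better organization and visual hierarchy. Implemented smooth background blur and fade animations for a professional modal presentation.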

Version 0.2.2

Press-to-Dictate Mode, Hands-Free Mode & Hybrid Shortcut Manager

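Press-to-Dictate Mode:

Press to Record, Release to Transcribe: Implemented a press-to-dictate mode that starts recording when the key is pressed and transcribes the full audio when it is released. Full Audio Transcription: Performs a complete transcription of the entire recorded segment on release, ensuring semantic completeness and context preservation. Automatic Text Selection: Automatically captures and inserts the selected text when the key is released, enabling a seamless text-replacement workflow. Minimum Press Duration: Configurable minimum press duration (default 150ms) to prevent accidental activation. Audio Padding for Short Recordings: Automatically pads very short audio segments to ensure accurate transcription, similar to professional dictation tools.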
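
The minimum press duration and short-recording padding can be sketched as follows. Only the 150ms default comes from the notes above; the sample rate, the one-second padding target, the choice of trailing silence as padding, and both function names are assumptions made for illustration.

```python
MIN_PRESS_MS = 150    # default from the release notes
SAMPLE_RATE = 16_000  # assumed sample rate (Hz)
MIN_AUDIO_MS = 1_000  # assumed minimum length after padding


def should_transcribe(press_ms: float, min_press_ms: float = MIN_PRESS_MS) -> bool:
    """Ignore presses shorter than the configured minimum duration."""
    return press_ms >= min_press_ms


def pad_audio(samples: list[float],
              sample_rate: int = SAMPLE_RATE,
              min_audio_ms: int = MIN_AUDIO_MS) -> list[float]:
    """Append silence so the clip is at least min_audio_ms long."""
    target = sample_rate * min_audio_ms // 1000
    missing = max(0, target - len(samples))
    return samples + [0.0] * missing
```

Clips that already meet the minimum length pass through `pad_audio` unchanged.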

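Hands-Free Mode:

Continuous Listening: Enables continuous audio monitoring with automatic segmentation and real-time transcription. Intelligent Auto-Segmentation: Automatically detects 500ms periods of silence to segment speech into meaningful chunks for transcription. Real-Time Streaming Transcription: Delivers transcription results as soon as each segment completes, enabling live conversation capture. Modifier Key Activation: Supports modifier-only key combinations (e.g., ⌘ + ⌥) for quickly activating hands-free mode without additional keys.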
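
Silence-based segmentation of this kind can be sketched with per-frame energy values. The 500ms silence window comes from the notes above; the 100ms frame size, the energy threshold, and the function name are assumptions, and a real voice activity detector would be more involved than a fixed threshold.

```python
def segment_speech(frames: list[float],
                   frame_ms: int = 100,
                   silence_ms: int = 500,
                   threshold: float = 0.01) -> list[tuple[int, int]]:
    """Split frame energies into speech segments.

    Returns (start, end) frame indices, end exclusive. A segment closes
    once silence has lasted at least silence_ms.
    """
    frames_needed = silence_ms // frame_ms
    segments: list[tuple[int, int]] = []
    start = None       # index of the first voiced frame in the open segment
    last_voiced = -1   # index of the most recent voiced frame
    silent_run = 0     # consecutive silent frames since last_voiced

    for i, energy in enumerate(frames):
        if energy >= threshold:
            if start is None:
                start = i
            last_voiced = i
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= frames_needed:
                segments.append((start, last_voiced + 1))
                start = None
                silent_run = 0

    if start is not None:  # flush a segment still open at the end
        segments.append((start, last_voiced + 1))
    return segments
```

Pauses shorter than the silence window stay inside the current segment, so a brief hesitation does not split a sentence.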

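Hybrid Shortcut Manager:

Fn Key Support: Added support for using the Fn key as the press-to-dictate shortcut via native CGEventTap monitoring. Modifier-Only Combinations: Enabled modifier-only key combinations (e.g., ⌘ + ⌥) for hands-free mode activation without requiring additional keys.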

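Modal Dialog State Management:

Improved state management for the settings modal dialogs, ensuring consistent behavior and a better user experience across all configuration interfaces.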

Version 0.2.1

New Application Icon, Splash Page & Dictionary Feature

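New Application Icon:

Visual Identity Update: Implemented a new application icon design with improved visual consistency. Improved Brand Recognition: Updated the icon in all system locations, including the Dock, menu bar, and System Preferences.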

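Splash Page & Onboarding Tour:

First-Run Experience: Added a splash page and a comprehensive onboarding tour for new users. Guided Setup Process: Step-by-step introduction to key features and functionality. Improved User Onboarding: Enhanced the initial user experience with interactive tutorials.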

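Dictionary Feature in Sidebar:

Unified Dictionary Access: Added a Dictionary entry to the sidebar that combines vocabulary and snippet management. Tabbed Interface: Implemented a tabbed interface within the Dictionary view for easy switching between vocabulary and snippets. Simplified Navigation: Simplified sidebar navigation by grouping related features.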

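Floating Action Button (FAB) UI:

UI Transition: Switched to a FAB-based user interface for improved accessibility and workflow. Enhanced Interaction: Improved user interaction patterns with the FAB design.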

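Advanced Keyboard Shortcut Support:

Fn Key for Press-to-Dictate: Added support for using the Fn key as the press-to-dictate shortcut. Modifier Key Combinations for Hands-Free Mode: Implemented support for modifier-only key combinations (e.g., ⌘ + ⌥) to activate hands-free mode. Flexible Shortcut Configuration: Enhanced the shortcut system to support both single modifier keys and complex key combinations.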

Version 0.2.0

Real-Time Transcription, OOTB Voice Models & Performance Optimization

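Real-Time Transcription Support:

Added a real-time transcription display to the HUD panel that shows live transcription results as they are generated. Implemented streaming transcription updates that show incremental transcription results during recording.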

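Out-of-the-Box (OOTB) Voice Models:

Implemented a default voice model selection system that automatically uses a preconfigured model on first launch. Users can start using the application immediately without waiting for a model download, providing a smooth first-run experience.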

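Microphone & VAD Performance Optimization:

Improved native audio capture performance with better resource management and reduced latency. Minimized audio processing latency for faster transcription response times.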

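Sidebar & Header UI Optimization:

Improved the sidebar design with better visual hierarchy and smoother collapse/expand animations. Enhanced the page header with an integrated microphone device selector and theme switcher for quick access. Improved UI components with better spacing, typography, and visual consistency across the application.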

Version 0.1.8

Audio Model Testing, HUD Enhancements & LLM Streaming

Audio Model Testing:

Added support for testing audio models directly within the application. Users can now verify model performance and accuracy before using them in production workflows.

HUD Panel Enhancements:

Added live audio waveform display in the HUD panel for visual feedback during recording. Implemented one-click copy functionality to quickly copy transcription content from the HUD panel. Enhanced HUD panel to show real-time transcription results directly in the panel interface.

LLM Streaming Support:

Added support to display LLM streaming responses in real-time. Users can now see LLM responses as they are generated, improving interaction feedback.

Manual Update Check:

Added manual update check functionality accessible from the sidebar. Users can now manually trigger update checks without waiting for automatic notifications.

Audio Device Detection Performance:

Improved performance and responsiveness of audio device detection. Reduced latency when scanning and listing available audio input devices. Optimized device detection to minimize system resource usage.

Microphone Settings Page Refactoring:

Refactored microphone settings page with better organization and user experience. Streamlined interface for selecting and configuring microphone devices. Improved visual design and information architecture for easier navigation.

Version 0.1.7

Google Sign-In & Always-On-Top HUD Panel

Google Account Sign-In Support:

Implemented Google OAuth authentication flow with secure code exchange for seamless account integration.

Always-On-Top HUD Panel:

Introduced a fully draggable HUD strip that can be repositioned anywhere on screen, with an elegant semi-transparent design that blends seamlessly with desktop content. The collapsible panel design features smooth expand/collapse animations, adjustable window sizes (Small, Medium, Large) for different use cases, and a real-time opacity adjustment slider for customizing panel transparency. The non-activating design ensures the panel does not steal focus from other applications, maintaining workflow continuity.

HUD Light/Dark Theme Readability:

Added comprehensive Light/Dark theme support for all HUD components, ensuring optimal visibility and readability across different system themes.

Floating Action Button (FAB):

The Floating Action Button has been deprecated in favor of the new HUD panel. All FAB functionality has been integrated into the HUD panel with improved accessibility and features. The HUD panel provides a more native macOS experience with always-on-top capability and better visibility.

Version 0.1.6

Support Vocabulary & Recording Metadata Upgrades

Vocabulary & Misspelling Toolkit:

Customize domain-specific terminology lists and casing rules for precise transcriptions. Define common misspellings with automatic correction to reduce manual cleanup.
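
A minimal sketch of automatic misspelling correction, assuming corrections are stored as a mapping from the misspelled form to the preferred term and applied on whole-word boundaries. The function name and the sample entries are hypothetical.

```python
import re


def apply_corrections(text: str, corrections: dict[str, str]) -> str:
    """Replace known misspellings with their preferred spelling and casing."""
    if not corrections:
        return text
    lookup = {wrong.casefold(): right for wrong, right in corrections.items()}
    # Try longer entries first so multi-word terms win over their prefixes.
    alternatives = sorted(corrections, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(word) for word in alternatives) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: lookup[m.group(0).casefold()], text)


corrections = {
    "cyber whisper": "CyberWhisper",
    "kubernetes": "Kubernetes",
    "postgress": "PostgreSQL",
}
fixed = apply_corrections("deploy cyber whisper on kubernetes with Postgress", corrections)
```

Matching is case-insensitive while the replacement carries the configured casing, which covers both the casing rules and the misspelling list with a single table.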

Recording History Metadata:

Capture and display sessionId, requestId, and the associated preset for faster troubleshooting. Include the new metadata fields in exports to support downstream analytics.

Version 0.1.5

User Profiles, Diagnostics & Header Experience

User Profile Management:

Introduced full profile display on the Profile page with name, gender, birth year, and profession details plus an editable form under Settings > Account.

User Avatar Enhancement:

Refined the avatar dropdown to highlight the full name with improved typography for clearer identity cues.

Profile Data Synchronization:

Added real-time loading and saving with consistent loading indicators and robust error handling.

Signed App Permission Validation:

Hardened notarized build entitlement checks with actionable error messaging and telemetry for signature failures.

Panic Hook Diagnostics:

Expanded the panic hook to capture structured stack traces and thread metadata while surfacing crash summaries and auto-restarting background workers.

Contextual Metadata Collection:

Gathered richer runtime context, including foreground app, OS build, and hardware model, to improve crash and feedback payload quality.

Header Layout Improvements:

Added a microphone selector and integrated theme switcher directly into the header for faster access.

Settings Page Refinements:

Added a birth-year dropdown, richer profession options, and unified loading indicators across Profile and Settings.

Version 0.1.4

Search Functionality & Data Synchronization Improvements

New Record Searchability:

Fixed an issue where newly added voice records were not searchable due to missing user_id filtering in search queries.

User ID Synchronization:

Enhanced user_id parsing and storage from JWT access tokens to ensure proper record association.
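
Reading a user identifier from a JWT payload can be sketched with the standard library alone. The sketch assumes the identifier lives in the standard `sub` claim, which the release notes do not confirm, and it only decodes the payload without verifying the signature, so it is not a substitute for server-side validation.

```python
import base64
import json
from typing import Optional


def user_id_from_jwt(token: str, claim: str = "sub") -> Optional[str]:
    """Extract a claim from a JWT payload WITHOUT verifying the signature."""
    parts = token.split(".")
    if len(parts) != 3:
        return None
    payload_b64 = parts[1]
    # JWTs strip base64 padding; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    try:
        payload = json.loads(base64.urlsafe_b64decode(payload_b64))
    except ValueError:  # covers bad base64, bad UTF-8, and bad JSON
        return None
    if not isinstance(payload, dict):
        return None
    value = payload.get(claim)
    return None if value is None else str(value)
```

Returning `None` for malformed tokens lets the caller decide whether to store the record as anonymous or retry after a token refresh.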

Search Query Optimization:

Improved search logic to correctly handle records with NULL user_id values while maintaining backward compatibility.
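
The NULL handling described above can be illustrated with SQLite. The table layout, column names, and `search_records` helper are assumptions for the sketch; the actual schema is not described in these notes.

```python
import sqlite3


def search_records(conn: sqlite3.Connection, user_id: str, term: str) -> list[int]:
    """Find records owned by the user, plus legacy rows with no owner."""
    rows = conn.execute(
        "SELECT id FROM records "
        "WHERE (user_id = ? OR user_id IS NULL) AND transcript LIKE ? "
        "ORDER BY id",
        (user_id, f"%{term}%"),
    ).fetchall()
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, user_id TEXT, transcript TEXT)")
conn.executemany(
    "INSERT INTO records (user_id, transcript) VALUES (?, ?)",
    [("alice", "weekly sync notes"), (None, "legacy sync recording"), ("bob", "sync with bob")],
)
alice_hits = search_records(conn, "alice", "sync")
```

The explicit `IS NULL` branch matters because `user_id = ?` never matches a NULL value in SQL, so older rows would otherwise drop out of the results.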

Orphaned Record Cleanup:

Enhanced resync and reindex logic to automatically detect and remove orphaned database records that no longer have corresponding files.
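
Orphan detection of this kind can be sketched as a pass over the database that checks each row against the file system. The `records` table, its `file_path` column, and `remove_orphans` are hypothetical names chosen for the example.

```python
import sqlite3
import tempfile
from pathlib import Path


def remove_orphans(conn: sqlite3.Connection, root: Path) -> list[int]:
    """Delete rows whose file no longer exists under root; return their ids."""
    rows = conn.execute("SELECT id, file_path FROM records ORDER BY id").fetchall()
    orphans = [rid for rid, rel_path in rows if not (root / rel_path).is_file()]
    conn.executemany("DELETE FROM records WHERE id = ?", [(rid,) for rid in orphans])
    conn.commit()
    return orphans


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.wav").write_bytes(b"")  # only this file exists on disk
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, file_path TEXT)")
    conn.executemany("INSERT INTO records (file_path) VALUES (?)",
                     [("a.wav",), ("missing.wav",)])
    removed = remove_orphans(conn, root)
    remaining = [row[0] for row in conn.execute("SELECT id FROM records")]
```

Collecting the orphan ids before deleting keeps the scan and the mutation separate, which also makes it easy to log what was removed.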

Cross-User Cleanup:

Improved orphan cleanup to handle both authenticated and anonymous user records during synchronization.

User ID Recovery:

Added automatic user_id recovery for records that were incorrectly stored with NULL values by analyzing file system paths.
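
Recovering an identifier from a path depends entirely on the storage layout, which these notes do not describe. The sketch below assumes a hypothetical layout of the form `.../users/<user_id>/.../<file>`; the `users` marker and the function name are illustrative only.

```python
from pathlib import PurePosixPath
from typing import Optional


def recover_user_id(file_path: str, marker: str = "users") -> Optional[str]:
    """Return the path component that follows the marker directory, if any."""
    parts = PurePosixPath(file_path).parts
    # Skip the last two components so the candidate is never the file itself.
    for i, part in enumerate(parts[:-2]):
        if part == marker:
            return parts[i + 1]
    return None
```

Paths that do not follow the assumed layout yield `None`, so such records can stay unassigned instead of being attributed to the wrong user.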

Path Management Refactoring:

Separated directory path retrieval from directory creation to prevent unintended directory creation during deletion operations.
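
The separation can be sketched as two functions: one that only computes a path and one that also creates it. The directory layout and all function names here are assumptions for illustration.

```python
import tempfile
from pathlib import Path


def recordings_dir(base: Path, user_id: str) -> Path:
    """Compute the recordings directory. Has no side effects."""
    return base / "users" / user_id / "recordings"


def ensure_recordings_dir(base: Path, user_id: str) -> Path:
    """Compute the directory and create it if needed (for write paths)."""
    path = recordings_dir(base, user_id)
    path.mkdir(parents=True, exist_ok=True)
    return path


def delete_recording(base: Path, user_id: str, name: str) -> bool:
    """Delete a recording using the side-effect-free lookup."""
    target = recordings_dir(base, user_id) / name
    if target.is_file():
        target.unlink()
        return True
    return False


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    deleted = delete_recording(base, "ghost", "1.wav")
    created_by_delete = recordings_dir(base, "ghost").exists()
    ensured = ensure_recordings_dir(base, "alice").is_dir()
```

Because deletion only uses the pure lookup, removing a recording for a user with no data leaves no empty directories behind.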

Version 0.1.3

Enhanced User Experience & Advanced Search Capabilities

Processing Flow Visualization:

Added visual processing flow display when using modes, showing the complete pipeline from speech input to LLM processing or text output.

Enhanced Mode Editing:

Implemented comprehensive mode detail editing with click-to-edit functionality, allowing users to modify preset settings, voice models, LLM configurations, and advanced options.

Full-Text Search:

Implemented comprehensive full-text search across recording history, supporting search in transcriptions, titles, and LLM-generated content.

Performance Optimization:

Enhanced search performance with optimized query execution.

Advanced Filtering:

Added status-based filtering (completed, processing, error) with filter application.

History Re-indexing:

Added manual re-indexing functionality to rebuild search indexes and sync file system records with database.

Infinite Scroll:

Implemented seamless infinite scrolling for recording history with automatic pagination, reducing initial load time and improving user experience for large datasets.
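
Automatic pagination for an infinite-scroll list is often built on a cursor that marks the last item already shown. The sketch below uses an in-memory list and id-based cursors as an assumption; the release notes do not say how the application pages its history.

```python
from typing import Optional


def fetch_page(records: list[dict],
               cursor: Optional[int] = None,
               limit: int = 20) -> tuple[list[dict], Optional[int]]:
    """Return one page of records, newest first, and the cursor for the next page.

    The cursor is the id of the last record on the page, or None when no
    further records remain.
    """
    ordered = sorted(records, key=lambda r: r["id"], reverse=True)
    if cursor is not None:
        ordered = [r for r in ordered if r["id"] < cursor]
    page = ordered[:limit]
    next_cursor = page[-1]["id"] if len(ordered) > limit else None
    return page, next_cursor
```

The UI would request the next page with the returned cursor whenever the user scrolls near the end of the list, and stop once the cursor is `None`.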

Dock Icon:

Updated macOS Dock icon with new design and improved visual consistency.

System Tray Icon:

Enhanced system tray icon with better visibility and template support.

High-Resolution Assets:

Included @2x and @3x variants for Retina displays and various screen densities.

Version 0.1.2

Enhanced Infrastructure & System-wide Integration

Mirror Support:

Added alternative download mirrors for improved reliability and speed.

Faster Downloads:

Optimized download performance with multiple mirror sources.

Better Availability:

Reduced download failures with redundant mirror support.

Complete Model Catalog:

Full access to all CyberWhisper Cloud models through REST API.

Dynamic Model Loading:

Real-time model list fetching from CyberWhisper Cloud API.

Enhanced Model Selection:

Support for 6+ models including CyberWhisper Fast (ultra-fast response model for real-time conversations), CyberWhisper Flash (lightning-fast model for simple tasks), GPT-5 Nano (OpenAI's latest lightweight model with balanced performance), GPT-4o Mini (efficient OpenAI model for daily tasks), DeepSeek V3.1 (advanced reasoning capabilities with latest DeepSeek technology), and Gemini 2.5 Flash Lite (Google's ultra-fast lightweight model for real-time applications).

Mode Selection:

Quick mode switching directly from system tray.

Microphone Management:

Easy microphone device selection from tray menu.

System-wide Access:

Control CyberWhisper from anywhere in your system.

Quick Actions:

Essential functions accessible without opening the main window.

Version 0.1.1

Command Palette & Global Shortcuts

Smart Mode Search:

Search and activate modes using intelligent keyword matching.

Multi-criteria Filtering:

Find modes by preset names, voice models, LLM models, or descriptions.

Detailed Mode Information:

Display comprehensive mode details including preset, voice model, LLM model, input/output languages, and feature settings.

Global Shortcut Access:

Open Command Palette from anywhere with ⌘ + ⇧ + K.

Fuzzy Search:

Intelligent search that matches partial keywords and related terms.
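
One common form of fuzzy matching is a subsequence test, sketched below: the query matches if its characters appear in the candidate in order, not necessarily adjacent. This is a stand-in technique chosen for illustration; the matching algorithm the Command Palette actually uses, including how it handles related terms, is not described here.

```python
def fuzzy_match(query: str, candidate: str) -> bool:
    """True if the query's characters appear in order within the candidate."""
    remaining = iter(candidate.casefold())
    # `ch in remaining` consumes the iterator up to the first match,
    # so later characters must be found after earlier ones.
    return all(ch in remaining for ch in query.casefold())


def filter_modes(query: str, mode_names: list[str]) -> list[str]:
    """Keep the mode names that fuzzily match the query."""
    return [name for name in mode_names if fuzzy_match(query, name)]
```

An empty query matches every mode, which suits a palette that shows the full list before the user starts typing.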

Real-time Filtering:

Instant results as you type with live mode filtering.

Visual Mode Status:

Clear indication of active vs inactive modes with status badges.

Keyboard Navigation:

Full keyboard support with arrow keys and Enter to activate.

Added support for customizable global keyboard shortcuts.

Toggle recording with customizable shortcut (default: ⌥ + N).

Cancel Recording:

Cancel ongoing recording with Esc key.

Change Mode:

Quick mode switching with global shortcut (default: ⌘ + ⇧ + K). Shortcuts work system-wide, even when the app is not in focus. Fallback shortcut registration for better compatibility across different systems.
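
Fallback registration can be sketched as trying a list of candidate shortcuts in order until one succeeds. The `try_register` callback stands in for whatever platform API performs the registration, and the shortcut strings are illustrative; neither is taken from the application.

```python
from typing import Callable, Iterable, Optional


def register_with_fallback(candidates: Iterable[str],
                           try_register: Callable[[str], bool]) -> Optional[str]:
    """Register the first shortcut the system accepts; return it, or None."""
    for shortcut in candidates:
        if try_register(shortcut):
            return shortcut
    return None


# Simulated registry: the preferred shortcut is already taken by another app.
taken = {"Cmd+Shift+K"}
registered = register_with_fallback(
    ["Cmd+Shift+K", "Ctrl+Shift+K"],
    lambda shortcut: shortcut not in taken,
)
```

Returning the shortcut that was actually registered lets the settings screen show the user which binding is in effect.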

Version 0.1.0

Download Base Speech Models, Modes & Presets, and History Viewer

Download Base Speech Models:

Support for downloading and running basic on-device speech-to-text models. This feature enables offline transcription capabilities and improved privacy for users who prefer to keep their audio data local.

Modes & Presets:

Introduced Presets for Voice to Text and Message workflows, making it easier to configure transcription and usage modes. Users can now quickly switch between different processing modes without manual configuration.

History Viewer:

Access and review your past transcriptions and interactions directly within the app. The History Viewer provides a comprehensive timeline of all your speech-to-text activities, making it easy to find and reference previous work.