Deep learning software refers to a category of software tools and frameworks designed to facilitate the creation, training, and deployment of deep learning models. Deep learning is a subset of machine learning that involves training artificial neural networks with many layers (hence the term "deep") to learn representations of data.

Deep learning software typically provides functionalities such as:

* Neural network architecture design: Tools for designing and customizing the architecture of deep neural networks, including specifying the number of layers, types of layers (e.g., convolutional, recurrent), and connections between layers.
* Data preprocessing and augmentation: Utilities for preparing and preprocessing input data for training deep learning models, including tasks such as normalization, data augmentation, and feature extraction.
* Model training and optimization: Algorithms and techniques for training deep learning models on large datasets, including optimization algorithms like stochastic gradient descent, and methods for handling overfitting such as regularization and dropout.
* Model evaluation and validation: Tools for evaluating the performance of trained models on validation and test datasets, including metrics such as accuracy, precision, recall, and F1-score.
* Deployment and inference: Facilities for deploying trained deep learning models into production environments for inference on new data, often through integration with software development frameworks and platforms.

Popular deep learning software frameworks include TensorFlow, PyTorch, Keras, and Caffe. These frameworks provide high-level abstractions and APIs that make it easier for developers and researchers to build and experiment with deep learning models without having to implement everything from scratch.
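As a rough illustration of the evaluation metrics named above (accuracy, precision, recall, and F1-score), here is a minimal sketch in plain Python for a binary classifier. The function name `classification_metrics` and the sample labels are assumptions made for this example; in practice, frameworks and companion libraries ship their own tested implementations.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for binary labels.

    Hypothetical helper for illustration only; not taken from any framework.
    """
    pairs = list(zip(y_true, y_pred))
    # Count true positives, false positives, and false negatives.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    # Guard every division so empty or degenerate inputs return 0.0.
    accuracy = correct / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up labels: 3 true positives, 1 false positive, 1 false negative.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 1, 0, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

With these sample labels all four metrics come out to 0.75. On imbalanced data the values usually diverge, which is why precision, recall, and F1 are commonly reported alongside accuracy.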
Picture to Text
picturetotext.info
The Picture to Text app converts images to editable text using OCR technology, supporting multiple languages and formats for easy text extraction from various sources.
Relu
relu.eu
Relu is a software company creating an AI software component to automatically convert 3D medical images into a Virtual Patient. We focus on making it easy to integrate this technology into your existing dental workflow/software.
VisionBot
visionbot.com
Visionbot.com is a scalable, easy to use service enabling field staff to collaborate more effectively leveraging AI for text and imagery. This leads to better event reporting and management, faster turnaround for project executions and vastly improves operational efficiency.
VXG
videoexpertsgroup.com
VXG is a global cloud video surveillance company that simplifies video management and makes systems scalable in a cost-effective way. Helping build customized, world-class video surveillance solutions for Systems Integrators, Security, Access Control, AI, Video Monitoring, Telecom and SaaS companies with over 150,000 cameras connected. The true, open cloud platform is designed for integration with other solutions or building new services that work with IP cameras. VXG is a future-proof, innovative technology platform and Cloud VMS engine for SaaS companies that is fully flexible and scalable, cost-effective, white-label and customizable. Delivering the fastest and easiest path to true cloud video surveillance, and providing a complete VMS with full source code and all the necessary components. The fully open (product agnostic) platform's key value lets customers deploy the solution in their own cloud/data center and integrate their in-house or 3rd party systems. Resulting in little effort from the customer's side and the fastest time to market. While empowering them with full control, branding and ownership over the product.
Wicket
wicketsoft.com
The Wicket facial authentication platform is a privacy-first, integrated solution that enables sensational event experiences for fans, guests, and employees with frictionless touchpoints that delight users and strengthen security for sports venues, live events, and credentialed facilities. Wicket's proprietary, privacy-first algorithms are built into our web-based platform and verify individuals in less than one second, making ingress and access management secure, frictionless, and convenient.
Jasper
jasper.ai
Jasper is an AI-powered content creation tool that generates consistent brand content for blogs, social media, and marketing, maintaining user-defined tones.
Krisp
krisp.ai
Krisp is an AI-powered app that cancels background noise during calls and meetings, provides real-time transcriptions, and offers customizable audio settings.
Notta
notta.ai
Notta is an AI transcription tool that converts voice conversations into text and offers features like summarization, translation, and integration with video platforms.
SpeechTexter
speechtexter.com
SpeechTexter is a free app that converts speech to text in real-time, supporting over 70 languages, suitable for note-taking and documentation.
Resemble.ai
resemble.ai
Resemble.ai creates custom AI-generated voices for diverse applications, offering voice cloning, multilingual support, and audio editing features.
Speechnotes
speechnotes.co
Speechnotes is a web-based app that converts speech to text for note-taking and transcription, using Google's speech recognition for accuracy.
Symbl.ai
symbl.ai
Symbl.ai is a conversation intelligence platform that provides real-time transcription and insights from unstructured conversation data using AI models.
Shownotes
shownotes.io
Shownotes is an AI-powered tool that automatically summarizes podcast episodes and creates a landing page with a full transcript and captions file. It uses ChatGPT to convert YouTube automatic captions and generate a memorable quote, and it can also create a blog post from the transcript. Shownotes offers three plans: Free, Creator, and Pro. The Free plan provides one shownote per month, a summarized transcript, and a landing page, and all shows are public. The Creator plan provides two shownotes per month, a summarized transcript, a landing page, the ability to make shows private, a landing page editor, a full transcript, and ums & ahs. The Pro plan provides unlimited shownotes, a summarized transcript, a landing page, the ability to make shows private, a landing page editor, a full transcript, ums & ahs, and a captions file.
AssemblyAI
assemblyai.com
AssemblyAI provides advanced speech-to-text and audio intelligence services for transcription, analysis, and insights from voice data.
Jammable
jammable.com
Jammable is an AI platform for creating music covers and voiceovers using a library of community-uploaded voice models.
Gladia
gladia.io
Gladia is a speech-to-text app that transcribes audio into written text accurately and efficiently in over 100 languages, supporting real-time processing and speaker identification.
PodcastAI
podcastai.com
PodcastAI is a platform that uses AI to assist with podcast production, offering features like transcription, audio enhancement, and content management.
Deepgram
deepgram.com
Deepgram provides an API for developers to access advanced speech AI for transcription, live audio processing, and contextual features in multiple languages.
OpenAI Platform
openai.com
The OpenAI Platform provides tools for text generation, summarization, and natural language processing using advanced AI models like GPT-3, GPT-4, and DALL-E.
Speechmatics
speechmatics.com
Speechmatics is the world’s leading expert in Speech Intelligence, combining the latest breakthroughs in AI and ML to unlock the business value in human speech. Businesses use Speechmatics worldwide to accurately understand and transcribe human-level speech into text regardless of demographic, age, gender, accent, dialect or location in real-time and on recorded media. Combining these transcripts with the latest AI-driven speech capabilities, businesses build products that utilize summaries, topics, sentiment, chapters, translation and more. Speechmatics processes over 300 years of transcription worldwide every month in 50 languages. Having pioneered machine learning in speech recognition, its neural networks consider acoustics, languages, dialects, multiple speakers, punctuation, capitalization, context and implicit meanings. Speechmatics is headquartered in Cambridge, UK with a New York office too. Speechmatics is a registered trademark.
Talkatoo
talkatoo.com
Talkatoo is reinventing dictation for medical professionals. Whether you're in the veterinary or human medical industry, Talkatoo is the speech to text software solution for you. Talkatoo is compatible with both Windows and Mac, works in any field where you can type (PIMs and EHRs included), and is very easy to use. * Talkatoo is a desktop dictation solution designed for clinical uses, with a focus on converting speech to text, including specialized vocabularies and medical terms. * Reviewers appreciate Talkatoo's ability to accurately convert speech into text, including complex medical terms, and its user-friendly interface that aids in increasing efficiency and productivity in creating medical records. * Reviewers noted that Talkatoo can be slow when processing a large number of instructions, has occasional difficulty in recognizing specific, less common terms, and its customer support response can be delayed.
Speechlogger
speechlogger.com
Speechlogger is a web-based app for real-time speech recognition, transcription, and translation, featuring auto-punctuation and text editing capabilities.
AI Voice Detector
aivoicedetector.com
AI Voice Detector is a voice verification tool that helps detect authenticity and filter out AI-generated voices. It offers users peace of mind and protection against audio manipulation, misinformation, voice scams, and plagiarism in oral assessments. * AI Voice Detector is a tool designed to distinguish between computer-generated voices and real human voices, specifically for business use cases, ensuring content authenticity and reliable reporting in customer service interactions. * Reviewers appreciate the software's implementation for protection against audio manipulation and voice scams, its ease of use, quick processing, and the ability to seamlessly process a wide range of audio file formats without any issues. * Users mentioned limitations such as the system requiring audio files to be at least 8 seconds long and free of background music, occasional misidentification of real voices as fake and vice versa, and limited software integration capabilities.
LumenVox
lumenvox.com
LumenVox is a leading provider of carrier-grade speech technology for organizations around the world. As part of Capacity, LumenVox transforms customer experiences with AI-driven speech recognition and voice authentication technology. LumenVox’s DNA is grounded in 20 years of voice technology and delivers the most comprehensive, cost-effective, and flexible speech offering. The company’s deep history in speech and voice technology enables companies to build voice experiences that not only understand what is being said, but also identify who is saying it. LumenVox is the only provider to give companies the flexibility and control they require to easily integrate applications in any environment – on-premise, multi-cloud or a hybrid model. In comparison to other speech providers, LumenVox can typically decrease the total cost of ownership (TCO) by as much as 35 percent. In addition, LumenVox can deploy new language models in an average of 60 days or less, where most providers require six months or more. ASR with Transcription is the cornerstone of the LumenVox software portfolio. LumenVox’s speech and voice software stack operates on a foundation of artificial intelligence and deep machine learning to deliver high performing future-proof speech technology. Powered by end-to-end deep neural networks, LumenVox’s ASR engine accelerates the ability to add new languages and dialects to serve a more diverse base of users. In conjunction with ASR, LumenVox offers Text-to-Speech (TTS) software to verbalize written text. This allows companies to turn chatbots into voicebots. Through LumenVox’s state-of-the-art toolset, companies can perform tuning and transcription–including parameter, grammar and version-upgrade testing–for any speech recognition application. The toolset helps customers avoid expensive, time-consuming professional services every time they need to augment their speech-enabled application. 
Customers who are on legacy ASRs can benefit from the toolset by having the ability to easily migrate their grammars and confidence values over to the LumenVox ASR.
ArtPro
artpro.com
ArtPro is an art inventory management software designed to help catalogue, archive, track, share and store artworks online.
Kukarella
kukarella.com
Make voice over with perfect audio clarity, pacing, inflection and pronunciation. On Kukarella you can try the best AI neural voices. All commercial rights are included. Kukarella offers access to over 800 AI voices in 130 languages and accents that are suitable for commercial use on any of our paid plans. In addition to voiceover, you can use Dialogues AI tool to create dialogues, or translate and dub your text into hundreds of languages with Simdubbing tool. And that's not all - you can transcribe all kinds of videos, audios, and YouTube videos, scrape text from webpages, and recognize text on images. Plus, Kukarella partners with some of the biggest names in tech, like Google, Amazon, Microsoft, and IBM, so you know you're getting the best. Lots of creative people from organizations like the Government of Canada, Salesforce, DHL, McDonald's, University of London, and Daimler-Mercedes use Kukarella for voiceovers and transcription, so you'll be in good company.
SpeechFlow
speechflow.io
SpeechFlow is a speech-to-text tool that provides accurate multilingual transcriptions for audio and video, with fast processing and tailored industry models.
Synth
usesynth.com
Synth is a comprehensive AI-powered solution for managing and leveraging business conversations. Synth transcribes, translates, and analyzes all your calls - be it sales calls, internal or external meetings, or call center calls and customer support interactions. Synth also provides automatic summaries of single or multiple calls. With its suite of advanced features like automated CRM data capture, multilingual transcription and translation, predictive analytics, and instantaneous insights delivered via Slack, Synth can turn your call data into actionable business strategies.
Features:
* Transcription and Translation: Engage with international clients with transcription and translation services in over 50 languages.
* Automatic Call Summarization: Leverage Synth's ability to provide comprehensive summaries of single or multiple calls, turning extensive conversation data into concise, actionable points and automated reports and documents.
* Automated CRM Synchronization: Keep your CRM updated with summaries, action items, and meeting details captured by Synth.
* Real-Time Insights: Instantly obtain prospect information, company details, suggested questions, and call summaries via Slack.
* Predictive Analytics: Harness data-driven insights on conversations likelihood and get tailored recommendations for your next steps.
* Robust Security Compliance: Synth upholds security standards and ensures the protection of your data and privacy.
Use cases:
* Power up Product Development: Capture and organize ideas with ease. Prioritize Action Items; Summarize and Share Insights.
* Streamline Marketing and Partnerships: Improve communication and collaboration with ease. Improve partnership meetings; Get everyone on the same page.
* Streamline user research: Effortlessly capture and recall user insights. Understand users better; Summarize user feedback.
* Make Data-Driven Investment Decisions: Effortlessly capture and recall key insights from pitch meetings and due diligence calls. Transcribe Pitch Meetings; Summarize Due Diligence Calls.
PromptSmart
promptsmart.com
PromptSmart is a teleprompter app that uses voice recognition to automatically adjust scrolling text, helping users deliver speeches and presentations smoothly.
VoxSciences
voxsci.com
VoxSciences converts your voicemails into text and delivers them to your mobile as a text (SMS) message and/or as an email.
Altered
altered.ai
Altered is a next-generation audio editor that integrates multiple Voice AI technologies into a user-friendly application for the production of high-quality voice content for various industries, including podcasters, video game studios, and eLearning.
Crescendo
crescendo.com
Crescendo Systems Corporation is a leading developer of Documentation, Digital Dictation, Voice Processing, Transcription and Workflow Management systems for the medical, legal, law enforcement and insurance sectors.
Philips SpeechLive
speechlive.com
Philips SpeechLive is a cloud-based dictation, transcription and speech recognition workflow solution. It helps authors go from speech to text quicker than ever before. SpeechLive has complete end-to-end encryption with Multi-Factor Authentication using Microsoft Azure cloud services. Our add-on speech recognition service has multilingual capabilities, real-time and deferred options, and voice command capability to format your document whilst you dictate.
Scribbl
scribbl.co
Transform your meeting experience with Scribbl – the ultimate AI-powered tool for enhancing productivity and collaboration. Say goodbye to the hassle of note-taking and embrace a new era of efficient meetings. Scribbl effortlessly captures, transcribes, and records your meetings, ensuring you never miss a beat. Our advanced AI breaks down each meeting into digestible topics and action items, streamlining the review process. With Scribbl's Chrome Extension, mark key moments in real-time, creating a seamless bridge between live discussions and post-meeting analysis. Sharing insights has never been easier. Whether it's with your team or external stakeholders, Scribbl's intuitive sharing features allow you to disseminate information swiftly and effectively.
ai|coustics
ai-coustics.com
ai|coustics is an AI tool that enhances speech audio quality by removing background noise and improving clarity for recordings and telecommunications.
Cochl
cochl.ai
Cochl is a research-based startup focusing on machine listening technology. We provide a sound AI system for developers and businesses to give their products and services human-like listening ability.
CrystalSound
crystalsound.ai
CrystalSound is a desktop app that uses AI technology to remove unwanted noise and distractions during calls, recordings, and online meetings. With its advanced algorithms and state-of-the-art features, CrystalSound can eliminate background noise, echo, howling effects, and other voices, ensuring that you can communicate clearly and effectively. CrystalSound runs on Mac, Windows, and Linux operating systems. With CrystalSound, you no longer have to worry about compatibility issues with your communication app. Our solution is designed to work seamlessly with popular apps such as Teams, Zoom, Google Meet, Loom, Discord, and many more.
Dictalogic
dictalogic.com
Dictalogic provides specialized modules—including audio to text, speech to text, conversation to text, and task delegation—all through one dashboard. * Audio-only: Traditional audio dictation, in which the audio is recorded and sent to a transcriber, who can be located anywhere (including working from home). * Audio to text: Digital transformation enables voice-to-text conversion on the fly. In this approach, audio is recorded and sent to be transcribed, and the audio is converted to text before it reaches the transcriber. We provide multiple options on assignment for you to explore. * Speech to text: We also offer the ability for real-time speech to text. The workflow is the same as other dictation, which can be sent to any transcriber. * Conversation to text : Dictalogic Conversation module is a speech-to-text solution that combines speech recognition, speaker identification, and sentence attribution to each speaker (also known as diarisation) to provide real-time and/or asynchronous transcription of any conversation—all encapsulated in a secure portal accessible any time, 24/7.
Dubber
dubber.net
Dubber is a Unified Cloud Call Recording & Voice AI solution for compliance and sales & service performance. Dubber’s fully compliant call recording solution can be switched on with a click, and is infinitely scalable in the Cloud - with no hardware required. Every call or conversation is captured automatically, stored securely in the Dubber Voice Intelligence Cloud, enriched with AI, and available instantly as a replay or insightful transcription, with real-time search, sentiment analysis, alerts & notifications.
Flipner AI
flipner.com
Flipner AI is an intelligent voice-to-text tool and content hub that turns audio snippets into ready-to-publish articles, serving as a quick assistant for writing. Flipner AI introduces a revolutionary approach to text creation, enabling writers to effortlessly capture and organize their myriad ideas anytime, anywhere. This innovative platform offers a unique content hub where both text and audio notes can be stored, facilitating the seamless transformation and amalgamation of thoughts into structured drafts or polished, ready-to-use documents through its user-friendly AI tool.
Jotengine
jotengine.com
Jotengine makes conversations and meetings more productive by turning them into audio transcription and video captioning.
Speech to Note
speechtonote.com
Speech to Note is an AI tool that converts spoken audio into editable text. It offers real-time transcription and organizational features for effective note-taking.
Spokestack
spokestack.io
Spokestack is a powerful platform of open source libraries and robust services to make your software fully voice-enabled including: * Automatic Speech Recognition * Voice Activity Detection * Wakeword * Text-to-speech * Custom Voice * Natural Language Understanding
Dictanote
dictanote.co
Dictanote is a notes app that uses speech-to-text technology for voice typing in over 50 languages, improving efficiency in note-taking during conversations or meetings.
Voxpow
voxpow.com
Voxpow converts speech to text for websites, allowing users to navigate and interact using voice commands in over 100 languages.
CueMe
cueme.com
CueME is the world's best billiards app to find people to play in person or virtually at any level of competition for singles, doubles, and tournaments. Play anyone anywhere from around the world with the CueME video, scoring, and ranking technology. As you play, you will win CueME chips with wins and accomplishments for recognition and prizes.
Datch
datch.io
Datch is a platform that leverages AI to capture highly detailed, structured human-centric data while surfacing asset insights for decision-making and resource management. Our goal is to cut deep into the availability shortfall by providing the data and intelligence needed to decrease asset MTTR, increase MTBF, support better planning and allow for faster decision making. In order to support the asset availability goals across resource management, reporting, planning, scheduling, and reliability, the product is designed around a single value proposition: “perfect data”. By perfect data, we mean complete, highly accurate, context rich reports coming in from the frontline, and perfect recall and distillation of data to the right people at the right time. Data capture is achieved through a combination of worker enablement capabilities, such as speech-to-text, real-time translation, and conversational AI, and data enrichment, through features that add context and guidance to transform the data as it’s captured. Data accessibility and asset insights are tools that are underpinned by generative search trained on the company’s document management system, work management history, and other language-rich data sources related to assets.
Jupitrr
jupitrr.com
Jupitrr AI Video Maker is an AI-powered tool that allows creators to transform their voice recordings and podcasts into personalized videos. With this tool, users can easily create stunning video content in just minutes. The AI technology behind Jupitrr AI Video Maker automates the process of generating stock videos for creators' videos, including stock footage, charts, subtitles, and more. The tool boasts a user-friendly interface similar to editing a word document, eliminating the need for complex timelines and making video editing a breeze. It offers the convenience of one-click access to a vast library of stock videos, saving users the hassle of searching for the right footage. Jupitrr AI Video Maker supports multiple languages, including Spanish, Hindi, French, Mandarin, and many more, making it accessible to a wide range of creators around the world. In addition to stock videos, the tool also provides options for adding subtitles and captions in various sizes and styles. It even includes AI-generated captivating charts, designed to simplify the process of incorporating visual data into videos. Jupitrr AI Video Maker aims to empower creators by allowing them to focus on their creative vision instead of spending excessive effort on video editing. With its simplicity and versatility, Jupitrr AI Video Maker is a valuable tool for content creators looking to enhance their video production process.
Phonexia
phonexia.com
Phonexia is a voice biometrics software that verifies users' identities through voice patterns, enhancing security and efficiency in authentication processes.
Picovoice
picovoice.ai
Picovoice is a voice AI platform that enables developers to integrate custom voice recognition features into applications for various environments.
Recognosco
recognosco.com
AI-powered speech recognition SDK leveraging Neural Network and Deep Learning technology. Built for partners. * Employing an indirect approach - innovative technology without competing with our partners * Large market and language coverage across the globe * Flexible deployment: available on-premise or in the cloud * Mutually beneficial, long-term relationships * Fair and flexible commercial models * Product roadmap driven by partners * Ultimate partner experience - consultative, attentive, and approachable. Recognosco's speech-enabling platform provides specialised topics for healthcare and legal, allowing our partners to enrich their solutions with our speech recognition SDK, with minimal integration effort. Recognosco's AI-powered speech technology is used globally to enable professionals to maximise productivity and efficiency. Used in 25 countries with 10 languages, across 2000+ deployments with over 35 partners.
Recordator
recordator.com
Recordator.com is a quick and easy solution for anyone looking to record their calls with great recording quality. It works on any mobile device and carrier without requiring any setup.
SoundHound
soundhound.com
SoundHound is a voice AI platform enabling businesses to create conversational experiences, primarily in automotive and retail sectors.
SpeechAce
speechace.com
At SpeechAce, we are committed to helping language learners improve their speaking abilities through versatile speech recognition technology. We developed the world's first speech recognition API that not only helps language learners assess their speaking skills but also identifies their exact areas of improvement. While the first version of our speech recognition API only provided a pronunciation score, we have now enhanced our offerings to include full speech transcription along with assessment of higher level skills such as vocabulary, grammar, fluency, coherence and relevance. SpeechAce boasts a diverse worldwide customer base which includes some of the smallest (but hottest) startups as well as some of the largest language learning providers in the world.
SpeechWrite
speechwrite.com
SpeechWrite is a full solution provider specialising in workflow solutions, digital dictation, voice recognition and PDF solutions. SpeechWrite's practical technology, sophisticated yet simple, allows you to enhance your working environment and simply work smarter. Working closely with OEMs and technology partners, SpeechWrite have extensive knowledge of the latest technology developments and market trends. Established in 2001, SpeechWrite have over 100 collective years in the dictation industry and pride themselves on their speed to market and after-sale support.
Spellex
spellex.com
Spellex offers spell checking, dictation, and assistive technology software solutions by delivering innovative products and providing world-class service to Spellex's customers.
Thirdlane
thirdlane.com
Thirdlane Connect serves as a versatile customer communication and team collaboration application, offering your team a suite of features including chat, voice and video calls, conferencing, screen sharing, file sharing, and seamless integration with CRM and various other business applications. Facilitating multichannel customer communications and team collaboration, Thirdlane Connect is designed for both local and remote workers, supporting web browsers, iPhone, Android devices, as well as Windows, Linux, and Mac desktops. This powerful application is fully integrated with and powered by the Thirdlane Business Phone System or Thirdlane Multi Tenant PBX platforms. These platforms can be securely deployed in various settings, whether on premises or in private or public clouds, ensuring flexibility and security for your communication infrastructure.
Vatis Tech
vatis.tech
Revolutionising Speech Recognition with Superior Accuracy and Affordability. Vatis Tech’s API provides advanced speech-to-text technology that automatically converts audio or video files into text with over 95% accuracy, using proprietary deep-learning speech recognition algorithms. Vatis Tech offers its speech-to-text API engine and web platform to agile startups, behemoth enterprises, podcasters, journalists, and developers alike. This allows solution and service providers to integrate the technology into their applications, regardless of industry or use case. * Deploy on-prem or on cloud * Build in any programming language with our API * Get scalable GPU infra for training and inference * Contextual features like speaker diarization, entity detection, punctuation, and capitalization or numeral conversion. * Text editing features inside the web application * Transcribe in real-time or pre-recorded files
Voiceitt
voiceitt.com
Voiceitt is an app that helps people with non-standard speech communicate effectively using voice recognition technology, including voice-activated devices.
© 2025 WebCatalog, Inc.