Aalisha Dalal

aalisha

AI & ML interests

Computer Vision, Deep Learning

Recent Activity

updated a dataset about 1 month ago
aalisha/srmdtranslations
published a dataset about 1 month ago
aalisha/srmdtranslations
reacted to reach-vb's post with 👀 over 1 year ago
Smol TTS models are here! OuteTTS-0.1-350M - Zero shot voice cloning, built on LLaMa architecture, CC-BY license! 🔥
> Pure language modeling approach to TTS
> Zero-shot voice cloning
> LLaMa architecture w/ Audio tokens (WavTokenizer)
> BONUS: Works on-device w/ llama.cpp ⚡

Three-step approach to TTS:
> Audio tokenization using WavTokenizer (75 tok per second)
> CTC forced alignment for word-to-audio token mapping
> Structured prompt creation w/ transcription, duration, audio tokens

The model is extremely impressive for 350M parameters! Kudos to the OuteAI team on such a brilliant feat - I'd love to see this be applied on larger data and smarter backbones like SmolLM 🤗

Check out the models here: https://huggingface.co/collections/OuteAI/outetts-6728aa71a53a076e4ba4817c

Organizations

Shrimad Rajchandra Mission Dharampur, SRMD

models 0

None public yet

datasets 1

aalisha/srmdtranslations

Updated May 9 • 174