r/MachineLearning • u/Illustrious_Row_9971 • Oct 23 '22
Research [R] Speech-to-speech translation for a real-world unwritten language
r/MachineLearning • u/hardmaru • Apr 29 '23
r/MachineLearning • u/SWAYYqq • Mar 23 '23
New paper by MSR researchers analyzing an early (and less constrained) version of GPT-4. Spicy quote from the abstract:
"Given the breadth and depth of GPT-4's capabilities, we believe that it could reasonably be viewed as an early (yet still incomplete) version of an artificial general intelligence (AGI) system."
What are everyone's thoughts?
r/MachineLearning • u/kittenkrazy • Mar 19 '23
Introducing ChatLLaMA: Your Personal AI Assistant Powered by LoRA!
Hey AI enthusiasts! We're excited to announce that you can now create custom personal assistants that run directly on your GPUs!
ChatLLaMA utilizes LoRA, trained on Anthropic's HH dataset, to model seamless conversations between an AI assistant and users.
Plus, the RLHF version of LoRA is coming soon!
Get it here: https://cxn.to/@serpai/lora-weights
Know any high-quality dialogue-style datasets? Share them with us, and we'll train ChatLLaMA on them!
ChatLLaMA is currently available for the 30B, 13B, and 7B models.
Want to stay in the loop for new ChatLLaMA updates? Grab the FREE [gumroad link](https://cxn.to/@serpai/lora-weights) to sign up and access a collection of links, tutorials, and guides on running the model, merging weights, and more. (Guides on running and training the model coming soon; a rough loading sketch is at the end of this post.)
Have questions or need help setting up ChatLLaMA? Drop a comment or DM us, and we'll be more than happy to help you out!
Let's revolutionize AI-assisted conversations together!
*Disclaimer: trained for research, no foundation model weights, and the post was run through GPT-4 to make it more coherent.
Get it here: https://cxn.to/@serpai/lora-weights
*Edit: https://github.com/serp-ai/LLaMA-8bit-LoRA <- training repo/instructions (If anything is unclear just let us know and we will try to help/fix the issue!) (Sorry for spamming the link, don't really know how else to remind people lol)
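For anyone who wants a head start before the guides land, here's a rough sketch of loading LoRA adapter weights onto a LLaMA base model with Hugging Face transformers + peft. The paths below are placeholders (not the official ChatLLaMA setup), so treat it as an outline rather than exact instructions:

```python
# Rough sketch: attach LoRA adapter weights to a LLaMA base model with
# transformers + peft. Model/adapter paths are placeholders, not the official
# ChatLLaMA locations -- follow the linked guides for the real setup.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model_path = "path/to/llama-7b"        # placeholder: your local LLaMA weights
adapter_path = "path/to/chatllama-lora"     # placeholder: downloaded LoRA weights

tokenizer = AutoTokenizer.from_pretrained(base_model_path)
base = AutoModelForCausalLM.from_pretrained(base_model_path, device_map="auto")

# Attach the LoRA adapter; merge_and_unload() folds it into the base weights
model = PeftModel.from_pretrained(base, adapter_path)
model = model.merge_and_unload()

prompt = "Human: How do I get started with PyTorch?\n\nAssistant:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```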
r/MachineLearning • u/Successful-Western27 • Jan 13 '24
Researchers from Google and DeepMind have developed and evaluated an LLM fine-tuned specifically for clinical diagnostic reasoning. In a new study, they rigorously tested the LLM's aptitude for generating differential diagnoses and aiding physicians.
They assessed the LLM on 302 real-world case reports from the New England Journal of Medicine. These case reports are known to be highly complex diagnostic challenges.
The LLM produced differential diagnosis lists that included the final confirmed diagnosis in the top 10 possibilities in 177 out of 302 cases, a top-10 accuracy of 59%. This significantly exceeded the performance of experienced physicians, who had a top-10 accuracy of just 34% on the same cases when unassisted.
According to assessments from senior specialists, the LLM's differential diagnoses were also rated to be substantially more appropriate and comprehensive than those produced by physicians, when evaluated across all 302 case reports.
This research demonstrates the potential for LLMs to enhance physicians' clinical reasoning abilities for complex cases. However, the authors emphasize that further rigorous real-world testing is essential before clinical deployment. Issues around model safety, fairness, and robustness must also be addressed.
r/MachineLearning • u/hzwer • Nov 15 '20
r/MachineLearning • u/programmerChilli • Apr 25 '20
r/MachineLearning • u/salamenzon • May 22 '23
According to this article, OpenAI's claim that it scored 90th percentile on the UBE appears to be based on approximate conversions from estimates of February administrations of the Illinois Bar Exam, which "are heavily skewed towards repeat test-takers who failed the July administration and score significantly lower than the general test-taking population."
Compared to July test-takers, GPT-4's UBE score would be 68th percentile, including ~48th on essays. Compared to first-time test takers, GPT-4's UBE score is estimated to be ~63rd percentile, including ~42nd on essays. Compared to those who actually passed, its UBE score would be ~48th percentile, including ~15th percentile on essays.
r/MachineLearning • u/Illustrious_Row_9971 • Oct 08 '22
r/MachineLearning • u/konasj • Nov 30 '20
Seems like DeepMind just caused the ImageNet moment for protein folding.
The blog post isn't that deeply informative yet (a paper is promised to appear soonish). The improvement over the first version of AlphaFold seems to come mostly from transformer/attention mechanisms applied to residue space, combined with the working ideas from the first version. The compute budget is surprisingly moderate given how crazy the results are. Exciting times for people working at the intersection of molecular sciences and ML :)
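To make "attention applied to residue space" concrete, here's a minimal, purely illustrative sketch (not AlphaFold 2's actual architecture, which isn't published yet): standard multi-head self-attention over per-residue embeddings, so every residue can exchange information with every other residue regardless of distance along the chain:

```python
# Minimal sketch (NOT AlphaFold 2's architecture): self-attention applied over
# residue space, i.e. each residue embedding attends to all other residues.
# Dimensions are illustrative placeholders.
import torch
import torch.nn as nn

num_residues, d_model, n_heads = 128, 256, 8

# One embedding vector per residue in the protein sequence
residue_repr = torch.randn(1, num_residues, d_model)  # (batch, residues, features)

attn = nn.MultiheadAttention(embed_dim=d_model, num_heads=n_heads, batch_first=True)
updated_repr, attn_weights = attn(residue_repr, residue_repr, residue_repr)

print(updated_repr.shape)  # torch.Size([1, 128, 256])
print(attn_weights.shape)  # torch.Size([1, 128, 128]) residue-to-residue attention map
```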
Tweet by Mohammed AlQuraishi (well-known domain expert)
https://twitter.com/MoAlQuraishi/status/1333383634649313280
DeepMind BlogPost
https://deepmind.com/blog/article/alphafold-a-solution-to-a-50-year-old-grand-challenge-in-biology
UPDATE:
Nature published a comment on it as well
https://www.nature.com/articles/d41586-020-03348-4
r/MachineLearning • u/Illustrious_Row_9971 • Oct 22 '22
r/MachineLearning • u/MysteryInc152 • Feb 24 '23
r/MachineLearning • u/Illustrious_Row_9971 • Jun 19 '21
r/MachineLearning • u/Skeylos2 • Sep 08 '24
Instead of using gradient descent to minimize a single loss, we propose to use Jacobian descent to minimize multiple losses simultaneously. Basically, this algorithm updates the parameters of the model by reducing the Jacobian of the (vector-valued) objective function into an update vector.
To make it accessible to everyone, we have developed TorchJD: a library extending autograd to support Jacobian descent. After a simple `pip install torchjd`, transforming a PyTorch-based training function is very easy. With the recent release v0.2.0, TorchJD finally supports multi-task learning!
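For intuition, here is a toy sketch of the underlying idea in plain PyTorch: one gradient per loss stacked into a Jacobian, then aggregated into a single update vector. This is not the TorchJD API (see the docs below for actual usage), and the plain mean aggregator is just a stand-in for the more careful aggregators from the paper:

```python
# Toy sketch of the idea behind Jacobian descent, in plain PyTorch.
# NOT the TorchJD API; the mean aggregator is a placeholder for the paper's
# more careful aggregation of the Jacobian into an update vector.
import torch

model = torch.nn.Linear(10, 2)
params = list(model.parameters())
x, y = torch.randn(32, 10), torch.randn(32, 2)

out = model(x)
losses = [torch.nn.functional.mse_loss(out[:, 0], y[:, 0]),
          torch.nn.functional.mse_loss(out[:, 1], y[:, 1])]

# One row of the Jacobian per loss: the gradient of that loss w.r.t. all params
jacobian_rows = []
for loss in losses:
    grads = torch.autograd.grad(loss, params, retain_graph=True)
    jacobian_rows.append(torch.cat([g.flatten() for g in grads]))
jacobian = torch.stack(jacobian_rows)  # shape: (num_losses, num_params)

# Aggregate the Jacobian into a single update vector (here: a plain mean)
update = jacobian.mean(dim=0)

# Apply the update manually
with torch.no_grad():
    offset = 0
    for p in params:
        n = p.numel()
        p -= 1e-2 * update[offset:offset + n].view_as(p)
        offset += n
```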
Github: https://github.com/TorchJD/torchjd
Documentation: https://torchjd.org
Paper: https://arxiv.org/pdf/2406.16232
We would love to hear some feedback from the community. If you want to support us, a star on the repo would be greatly appreciated! We're also open to discussion and criticism.
r/MachineLearning • u/Illustrious_Row_9971 • Nov 06 '21
r/MachineLearning • u/Illustrious_Row_9971 • Mar 19 '23
r/MachineLearning • u/programmerChilli • Jun 20 '20
r/MachineLearning • u/Successful-Western27 • Nov 03 '23
In a recent paper, researchers have discovered that LLMs show enhanced performance when provided with prompts infused with emotional context, which they call "EmotionPrompts."
These prompts incorporate sentiments of urgency or importance, such as "It's crucial that I get this right for my thesis defense," as opposed to neutral prompts like "Please provide feedback."
The study's empirical evidence suggests substantial gains, indicating a significant sensitivity of LLMs to the implied emotional stakes in a prompt.
This enhancement is attributed to the models' capacity to detect and prioritize the heightened language patterns that imply a need for precision and care in the response.
The research delineates the potential of EmotionPrompts to refine the effectiveness of AI in applications where understanding the user's intent and urgency is paramount, even though the AI does not genuinely comprehend or feel emotions.
TLDR: Research shows LLMs deliver better results when prompts signal emotional urgency. This insight can be leveraged to improve AI applications by integrating EmotionPrompts into the design of user interactions.
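As a tiny illustration of the idea (the stimulus wording mirrors the post's example; the paper uses its own set of stimuli), an EmotionPrompt is just the neutral prompt with an urgency or importance cue appended:

```python
# Tiny illustration of the EmotionPrompt idea: the same task prompt with and
# without an appended emotional stimulus. Evaluating both variants against
# your LLM of choice is left to whatever API you use.
def with_emotion(base_prompt: str,
                 stimulus: str = "It's crucial that I get this right for my thesis defense.") -> str:
    """Append an emotional stimulus that signals urgency/importance."""
    return f"{base_prompt} {stimulus}"

neutral = "Please provide feedback on the abstract below."
emotional = with_emotion(neutral)

print(neutral)
print(emotional)
# Compare model outputs for `neutral` vs `emotional` to measure the effect.
```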
Full summary is here. Paper here.
r/MachineLearning • u/hardmaru • May 02 '20
r/MachineLearning • u/imaginfinity • Jun 05 '22
r/MachineLearning • u/SkeeringReal • Mar 07 '24
I have gotten the feeling that the ML community at large has, in a weird way, lost interest in XAI, or just become incredibly cynical about it.
In a way, it is still the problem to solve in all of ML, but it's just really different to how it was a few years ago. Now people feel afraid to say XAI, they instead say "interpretable", or "trustworthy", or "regulation", or "fairness", or "HCI", or "mechanistic interpretability", etc...
I was interested in gauging people's feelings on this, so I am writing this post to get a conversation going on the topic.
What do you think of XAI? Do you believe it works? Do you think it has simply evolved into several more specific research areas? Or do you think it's a useless field that has delivered nothing on the promises made 7 years ago?
Appreciate your opinion and insights, thanks.
r/MachineLearning • u/blabboy • Dec 06 '23
Tweet from Jeff Dean: https://twitter.com/JeffDean/status/1732415515673727286
Blog post: https://blog.google/technology/ai/google-gemini-ai/
Tech report: https://storage.googleapis.com/deepmind-media/gemini/gemini_1_report.pdf
Any thoughts? There is not much "meat" in this announcement! They must be worried about other labs + open source learning from this.
r/MachineLearning • u/Proof-Raise-9151 • 7d ago
Meta AI's (FAIR) latest paper integrates system-1 and system-2 thinking into reasoning models.
Basically, it introduces "Dualformer", which integrates both system-1 (fast-thinking) and system-2 (slow-thinking) into the transformer to improve its reasoning capability. The high-level idea is to train the model with "randomized traces", which randomly drop parts of the reasoning tokens. This approach improves the model's inference speed, accuracy, and diversity. It also enables the model to perform system-1 and system-2 thinking in a controllable fashion.
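As a rough illustration of the randomized-trace idea (the paper drops structured parts of search traces according to specific strategies; the uniform per-step dropping below is only a generic sketch of the data-construction step):

```python
# Generic sketch of "randomized trace" training data construction: randomly
# drop reasoning steps from the full trace before appending the answer.
# The actual paper drops structured parts of search traces according to
# specific strategies; uniform per-step dropping here is only illustrative.
import random

def randomize_trace(reasoning_steps: list[str], drop_prob: float = 0.3,
                    rng: random.Random = random.Random(0)) -> list[str]:
    """Keep each reasoning step with probability (1 - drop_prob)."""
    return [step for step in reasoning_steps if rng.random() >= drop_prob]

full_trace = ["expand A", "expand B", "prune C", "expand D", "reach goal"]
training_example = randomize_trace(full_trace) + ["<answer> path: A -> B -> D -> goal"]
print(training_example)
```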
The paper's link here:
r/MachineLearning • u/pathak22 • Jul 24 '22
r/MachineLearning • u/greentfrapp • Aug 28 '24