#PaperThread

Interesting developments in subquadratic alternatives to self-attention-based transformers for long-sequence modeling (32k and more).

Hyena Hierarchy: Towards Larger Convolutional Language Models

arxiv.org/abs/2302.10866

They propose replacing the quadratic self-attention layers with an operator built from implicitly parametrized long-kernel 1D convolutions.

#DeepLearning #LLMs #PaperThread

1/n

arXiv.org · Hyena Hierarchy: Towards Larger Convolutional Language Models
Recent advances in deep learning have relied heavily on the use of large Transformers due to their ability to learn at scale. However, the core building block of Transformers, the attention operator, exhibits quadratic cost in sequence length, limiting the amount of context accessible. Existing subquadratic methods based on low-rank and sparse approximations need to be combined with dense attention layers to match Transformers, indicating a gap in capability. In this work, we propose Hyena, a subquadratic drop-in replacement for attention constructed by interleaving implicitly parametrized long convolutions and data-controlled gating. In recall and reasoning tasks on sequences of thousands to hundreds of thousands of tokens, Hyena improves accuracy by more than 50 points over operators relying on state-spaces and other implicit and explicit methods, matching attention-based models. We set a new state-of-the-art for dense-attention-free architectures on language modeling in standard datasets (WikiText103 and The Pile), reaching Transformer quality with a 20% reduction in training compute required at sequence length 2K. Hyena operators are twice as fast as highly optimized attention at sequence length 8K, and 100x faster at sequence length 64K.
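
To make the "long convolution plus gating" idea in the abstract concrete, here is a minimal sketch in plain Python, not the authors' implementation. It shows one gating step: a causal convolution with a kernel as long as the sequence, computed through an FFT in O(L log L) rather than O(L^2), then multiplied elementwise by a gate. The function names and the decaying-cosine filter in `implicit_kernel` are assumptions made for illustration; the post does not say how the paper parametrizes its filters.

```python
import cmath
import math


def _fft(values, inverse=False):
    """Recursive radix-2 FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = _fft(values[0::2], inverse)
    odd = _fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def causal_conv_fft(kernel, signal):
    """Causal convolution y[t] = sum_{s<=t} kernel[s] * signal[t-s] via FFT."""
    length = len(signal)
    size = 1
    while size < 2 * length:  # zero-pad so circular wrap-around cannot occur
        size *= 2
    k_hat = _fft(list(kernel) + [0.0] * (size - len(kernel)))
    s_hat = _fft(list(signal) + [0.0] * (size - length))
    product = [a * b for a, b in zip(k_hat, s_hat)]
    full = _fft(product, inverse=True)
    # The unnormalized inverse transform needs a 1/size factor.
    return [full[t].real / size for t in range(length)]


def causal_conv_direct(kernel, signal):
    """O(L^2) reference, used only to check the FFT path."""
    return [
        sum(kernel[s] * signal[t - s] for s in range(t + 1))
        for t in range(len(signal))
    ]


def implicit_kernel(length, decay=0.05, freq=0.3):
    """Hypothetical implicit filter: computed from position, not stored per tap."""
    return [math.exp(-decay * t) * math.cos(freq * t) for t in range(length)]


def gated_long_conv(gate, value, kernel):
    """One gating step: elementwise gate times a long causal convolution."""
    mixed = causal_conv_fft(kernel, value)
    return [g * m for g, m in zip(gate, mixed)]


if __name__ == "__main__":
    seq_len = 50
    value = [math.sin(0.1 * t) for t in range(seq_len)]
    gate = [math.cos(0.2 * t) for t in range(seq_len)]
    kernel = implicit_kernel(seq_len)
    output = gated_long_conv(gate, value, kernel)
    print(len(output))
```

Because the kernel is generated from the position index, the number of filter parameters here (two) does not grow with the sequence length, which is the sense of "implicit" this sketch assumes.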

1/n
Our pre-print is finally out!
Here's my first #paperthread 🧵
In this work, my co-authors and I clustered ischaemic stroke patients' profiles and recovered common patterns of cognitive and sensorimotor damage.

...Historically, many focal lesions to specific cortical areas were associated with specific deficits, but most strokes involve subcortical regions and bring multivariate patterns of deficits.
To characterize those patterns, many studies have turned to correlation analysis, factor analysis, and PCA, focusing on the relations among variables (i.e., domains of impairment)...

medrxiv.org/content/10.1101/20

medRxiv · Behavior Clusters in Ischemic Stroke using NIHSS
BACKGROUND: Stroke is one of the leading causes of death and disability. The resulting behavioral deficits can be measured with clinical scales of motor, sensory, and cognitive impairment. The most common of such scales is the National Institutes of Health Stroke Scale, or NIHSS. Computerized tomography (CT) and magnetic resonance imaging (MRI) scans show predominantly subcortical or subcortical-cortical lesions, with pure cortical lesions occurring less frequently. While many experimental studies have correlated specific deficits (e.g. motor or language impairment) with stroke lesion locations, the mapping between symptoms and lesions is not straightforward in clinical practice. The advancement of machine learning and data science in recent years has shown unprecedented opportunities even in the biomedical domain. Nevertheless, their application to medicine is not simple, and the development of data driven methods to learn general mathematical models of diseases from healthcare data is still an unsolved challenge.
METHODS: In this paper we measure statistical similarities of stroke patients based on their NIHSS scores, and we aggregate symptoms profiles through two different unsupervised machine learning techniques: spectral clustering and affinity propagation.
RESULTS: We identify clusters of patients with largely overlapping, coherent lesions, based on the similarity of behavioral profiles.
CONCLUSIONS: Overall, we show that an unsupervised learning workflow, open source and transferable to other conditions, can identify coherent mathematical representations of stroke lesions based only on NIHSS data.
Competing Interest Statement: The authors have declared no competing interest.
Funding Statement: This work was supported by the Department of excellence 2018-2022 initiative of the Italian Ministry of education (MIUR) awarded to the Department of Neuroscience-University of Padua.
Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes.
The details of the IRB/oversight body that provided approval or exemption for the research described are given below. For data of patients of the Saint Louis cohort: the Internal Review Board of Washington University School of Medicine (WUSM) gave ethical approval for this work. For data of patients of the Padua cohort: the Ethics Committee of the Azienda Ospedale Università Padova (AOUP) gave ethical approval for this work.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals. Yes.
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes.
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable. Yes.
Data can be made available upon reasonable request to Maurizio Corbetta at maurizio.corbetta{at}unipd.it.
Abbreviations: AP: Affinity Propagation. GDM: General Distance Measure. GSM: General Similarity Measure. NIHSS: National Institutes of Health Stroke Scale. RSC: Repeated Spectral Clustering.

IT'S PAPER DAY!

Seidel & Prinoth et al. has just been accepted for publication in A&A and you can find it on arXiv already today: arxiv.org/abs/2308.13622 ✨

Let us tell you a bit more 👇🏼🧵

@JuliaVSeidel

arXiv.org · Detection of atmospheric species and dynamics in the bloated hot Jupiter WASP-172 b with ESPRESSO
The population of strongly irradiated Jupiter-sized planets has no equivalent in the Solar System. It is characterised by strongly bloated atmospheres and atmospheric large-scale heights. Recent space-based observations of SO2 photochemistry demonstrated the knowledge that can be gained from detailed atmospheric studies of these unusual planets about Earth's uniqueness. Aims. Here we explore the atmosphere of WASP-172 b, a similar planet in temperature and bloating to the recently studied HD 149026 b. In this work, we characterise the atmospheric composition and subsequently the atmospheric dynamics of this prime target. Methods. We observed a particular transit of WASP-172 b in front of its host star with ESO's ESPRESSO spectrograph and analysed the spectra obtained before, during, and after transit. Results. We detect the absorption of starlight by WASP-172 b's atmosphere by sodium (5.6σ), hydrogen (19.5σ) and obtained a tentative detection of iron (4.1σ). We detect strong, yet varying, blue shifts, relative to the planetary rest frame, of all of these absorption features. This allows for a preliminary study of the atmospheric dynamics of WASP-172 b. Conclusions. With only one transit, we were able to detect a wide variety of species, clearly tracking different atmospheric layers with possible jets. WASP-172 b is a prime follow-up target for a more in-depth characterisation both for ground and space-based observatories. If the detection of Fe is confirmed, this may suggest that radius inflation is an important determinant for the detectability of Fe in hot Jupiters, as several non-detections of Fe have been published for planets that are hotter but less inflated than WASP-172 b.

Hello, world!

Look, it's me 👀 My second first-author paper has just been accepted for publication in A&A and you can find it on arXiv already today ✨

Let me tell you a bit about it 👇🏼🧵

(okay a lot, it's a big kid)

arxiv.org/abs/2308.04523

arXiv.org · Time-resolved transmission spectroscopy of the ultra-hot Jupiter WASP-189 b
Ultra-hot Jupiters are tidally locked with their host stars dividing their atmospheres into a hot dayside and a colder nightside. As the planet moves through transit, different regions of the atmosphere rotate into view revealing different chemical regimes. High-resolution spectrographs can observe asymmetries and velocity shifts, and offer the possibility for time-resolved spectroscopy. In this study, we search for other atoms and molecules in the planet's transmission spectrum and investigate asymmetric signals. We analyse and combine eight transits of the ultra-hot Jupiter WASP-189 b taken with the HARPS, HARPS-N, ESPRESSO and MAROON-X high-resolution spectrographs. Using the cross-correlation technique, we search for neutral and ionised atoms, and oxides and compare the obtained signals to model predictions. We report significant detections for H, Na, Mg, Ca, Ca+, Ti, Ti+, TiO, V, Cr, Mn, Fe, Fe+, Ni, Sr, Sr+, and Ba+. Of these, Sr, Sr+, and Ba+ are detected for the first time in the transmission spectrum of WASP-189 b. In addition, we robustly confirm the detection of titanium oxide based on observations with HARPS and HARPS-N using the follow-up observations performed with MAROON-X and ESPRESSO. By fitting the orbital traces of the detected species by means of time-resolved spectroscopy using a Bayesian framework, we infer posterior distributions for orbital parameters as well as lineshapes. Our results indicate that different species must originate from different regions of the atmosphere to be able to explain the observed time dependence of the signals. Throughout the course of the transit, most signal strengths are expected to increase due to the larger atmospheric scale height at the hotter trailing terminator. For some species, however, the signals are instead observed to weaken due to ionisation for atoms and their ions, or the dissociation of molecules on the dayside.

Hot off the press! 🔥
doi.org/10.1016/j.neubiorev.20

We performed a meta-analysis of neuroimaging studies looking at interpersonal neural synchronization (INS) and contextualized the results using diverse public databases to develop new hypotheses on physiological processes potentially involved in INS.

What is he talking about, you ask? See below ⏬

1/9

#NewPaper #PaperThread #NeuroPaper #NeuroPaperThread #NewNeuroPaper #NeuroScience #Neuro #Psych #Psychiatry #Psychology #Cognition #Brain #Communication #Science #Research #DataViz #DataScience
@neuroscience @neuro @cognition @fmri @phdstudents @academicchatter

Continued thread

3 main takeaways:

(1) Scientific inference cannot be automated or proceduralized.

(2) Not heeding the warning in (1) will necessarily limit our ability for scientific discovery and understanding.

(3) The only way not to limit scientific discovery is to allow for unbounded pluralism.

/end of #PaperThread

What happens when you give Recurrent Neural Networks brain-inspired constraints of 3D spatial structure & neural communication during learning?

🧠🌐🤖

In our new project we show typical structural & functional #neuroscience motifs like modularity, small-worldness, functional clusters, mixed selectivity & efficiency emerge in these spatially-embedded RNNs

#Preprint
biorxiv.org/content/10.1101/20

#PaperThread / Summary
jachterberg.com/seRNN

#AI #neuralnetwork #NeuroAI #RNN #newpaper @neuroscience