A Coding Implementation of End-to-End Brain Decoding from MEG Signals Using NeuralSet and Deep Learning for Predicting Linguistic Features

MarkTechPost has put out a story worth paying attention to: a tutorial on decoding linguistic features directly from brain signals with a modern neuroAI pipeline, working with MEG data to build an end-to-end system that turns raw neural activity into meaningful predictions, in this case estimating word length from brain responses. For AI, a story like this is usually not just about a new model or demo; it is about the direction of product strategy. If you follow AI updates, pieces like this are often a sign that the boundary between "experiment" and "everyday working tool" keeps getting thinner.

Looking closer, here is the tutorial itself, lightly reformatted:

In this tutorial, we explore how we can decode linguistic features directly from brain signals using a modern neuroAI pipeline. We work with MEG data and build an end-to-end system that transforms raw neural activity into meaningful predictions, in this case estimating word length from brain responses. We set up the environment, load and process neural events, design a custom feature extractor, and construct a structured data pipeline using NeuralSet. From there, we train a convolutional neural network to learn patterns in the temporal and spatial structure of MEG signals. Throughout the process, we focus on building a clean, modular workflow that mirrors real-world neuroAI research practices.

```python
import subprocess, sys, importlib, pkgutil

def pip_install(*pkgs):
    # Quietly pip-install the given packages, surfacing output only on failure.
    print(f"pip install {' '.join(pkgs)} ...")
    r = subprocess.run(
        [sys.executable, "-m", "pip", "install", "-q", *pkgs],
        capture_output=True, text=True,
    )
    if r.returncode != 0:
        print("pip STDOUT:", r.stdout[-2000:])
        print("pip STDERR:", r.stderr[-2000:])
        raise RuntimeError("pip install failed; see output above.")
    print(" ok")

# NOTE: the original call was truncated in the source after "numpy>=2.0";
# the remaining packages are inferred from the imports used later on.
pip_install("numpy>=2.0", "torch", "matplotlib", "neuralset", "neuralfetch")
```

We install and validate all required dependencies, ensuring critical packages such as NumPy and NeuralSet are properly configured. We perform a quick NumPy check to avoid runtime issues later in the pipeline. We then import all core libraries needed for data processing, modeling, and visualization.

```python
# Core imports. The original import cell did not survive extraction; the
# aliases below are inferred from how they are used in the rest of the code,
# and the `ext_mod` path in particular is an assumption.
import typing as tp

import matplotlib.pyplot as plt
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader

import neuralset as ns
import neuralset.extractors as ext_mod  # assumed module path

def deep_import(pkg_name: str):
    # Import a package and walk every submodule so that any study classes
    # they define get registered in the catalog as a side effect.
    try:
        pkg = importlib.import_module(pkg_name)
    except Exception as e:
        print(f" could not import {pkg_name}: {e}")
        return
    if not hasattr(pkg, "__path__"):
        return
    for m in pkgutil.walk_packages(pkg.__path__, prefix=pkg_name + "."):
        try:
            importlib.import_module(m.name)
        except Exception:
            pass

deep_import("neuralfetch")
deep_import("neuralset")

torch.manual_seed(0); np.random.seed(0)

catalog = ns.Study.catalog()
print(f"\n{len(catalog)} studies registered.")

# Prefer the synthetic test studies; otherwise fall back to any MEG study.
preferred = ["Fake2025Meg", "Test2025Meg", "Test2023Meg"]
study_name = next((n for n in preferred if n in catalog), None)
if study_name is None:
    meg_studies = [n for n, c in catalog.items() if "Meg" in c.neuro_types()]
    study_name = meg_studies[0] if meg_studies else None
if study_name is None:
    raise RuntimeError(
        "No MEG study available. Catalog: "
        f"{sorted(catalog.keys())[:20]}… "
        "Install neuralfetch correctly (pip install neuralfetch) and re-run."
    )
print(f"→ Using study: {study_name}")
```

We dynamically import all submodules from NeuralFetch and NeuralSet to ensure that all available studies are properly registered. We seed the random number generators for reproducibility and inspect the study catalog to identify available MEG datasets. We then select an appropriate study to use as the foundation for our pipeline.
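One cell from the original did not survive extraction: the text above mentions a "quick NumPy check", but its code is missing, just as the `pip_install` call was cut off mid-list. A minimal sketch of what such a check might look like, offered as a reconstruction rather than the original code:

```python
# Reconstructed sketch of the "quick NumPy check" described in the text;
# the original cell was lost, so this is an assumption about its intent.
import numpy as np

if int(np.__version__.split(".")[0]) < 2:
    raise RuntimeError(
        f"NumPy>=2.0 expected, found {np.__version__}; "
        "restart the runtime after upgrading."
    )
print(f"NumPy {np.__version__} OK")
```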
```python
class CharCount(ext_mod.BaseStatic):
    # Static per-event feature: the number of characters in a word.
    event_types: tp.Literal["Word"] = "Word"

    def get_static(self, event) -> torch.Tensor:
        return torch.tensor([float(len(event.text))], dtype=torch.float32)

print("\nBuilding chain...")
chain = ns.Chain(steps=[
    {"name": study_name, "path": str(ns.CACHE_FOLDER)},
    {"name": "QueryEvents", "query": "type in ['Word', 'Meg']"},
])
events = chain.run()
print(f" → {len(events)} events; types={sorted(events.type.unique().tolist())}")
print(f" → Words: {(events.type=='Word').sum()} | "
      f"timelines: {events.timeline.nunique()}")

print("\nSample words:")
print(events[events.type=='Word'][["start", "duration", "text", "timeline"]]
      .head(5).to_string(index=False))

print("\nBuilding segmenter...")
segmenter = ns.dataloader.Segmenter(
    extractors={
        "meg": {"name": "MegExtractor", "frequency": 100.0},
        "char_count": CharCount(aggregation="trigger"),
    },
    trigger_query="type == 'Word'",
    start=-0.2,          # 200 ms before each word onset
    duration=0.8,        # 800 ms window per segment
    drop_incomplete=True,
)
dataset = segmenter.apply(events)
print(f" → SegmentDataset: {len(dataset)} segments")

s0 = dataset[0]
print(f"\nSingle item:\n meg : {tuple(s0.data['meg'].shape)}")
print(f" char_count : {s0.data['char_count'].item()} "
      f"(word: {s0.segments[0].trigger.text!r})")
```

We define a custom extractor that computes the character count of each word event, enabling us to create a supervised learning target. We build a processing chain to load and filter relevant events from the selected study. We then segment the MEG signals around word events and construct a dataset ready for modeling.
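One nice property of this design is that the prediction target is just another extractor. As an illustration of how the same pipeline could decode a different linguistic feature, here is a hedged sketch of an alternative target, word frequency, assuming the same `BaseStatic` interface used by `CharCount`; the third-party `wordfreq` package is an assumption and not part of the original tutorial:

```python
# Hypothetical alternative extractor (not in the original tutorial):
# predict Zipf-scale word frequency instead of character count.
# Assumes the same BaseStatic interface as CharCount above and the
# third-party `wordfreq` package (pip install wordfreq).
from wordfreq import zipf_frequency

class ZipfFreq(ext_mod.BaseStatic):
    event_types: tp.Literal["Word"] = "Word"

    def get_static(self, event) -> torch.Tensor:
        return torch.tensor([zipf_frequency(event.text, "en")],
                            dtype=torch.float32)

# Swapping the target would then be a one-key change in the segmenter
# config: "char_count": CharCount(...)  ->  "zipf": ZipfFreq(aggregation="trigger")
```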
```python
rng = np.random.RandomState(42)
perm = rng.permutation(len(dataset))
n_tr, n_va = int(0.70 * len(dataset)), int(0.15 * len(dataset))
train_ds = dataset.select(perm[:n_tr])
val_ds = dataset.select(perm[n_tr:n_tr + n_va])
test_ds = dataset.select(perm[n_tr + n_va:])
print(f"\nSplit | train={len(train_ds)} val={len(val_ds)} test={len(test_ds)}")

mk = lambda d, sh: DataLoader(d, batch_size=32, shuffle=sh,
                              collate_fn=d.collate_fn, drop_last=False)
train_loader, val_loader, test_loader = mk(train_ds, True), mk(val_ds, False), mk(test_ds, False)

probe = next(iter(train_loader))
n_ch, n_t = probe.data["meg"].shape[-2:]
print(f" → batch[meg] shape: {tuple(probe.data['meg'].shape)}")
print(f" → batch[char] shape: {tuple(probe.data['char_count'].shape)}")

class MEGDecoder(nn.Module):
    # 1x1 spatial conv to mix channels, two temporal convs, pooled linear head.
    def __init__(self, n_channels: int, mid: int = 64):
        super().__init__()
        self.spatial = nn.Conv1d(n_channels, mid, 1)
        self.bn0 = nn.BatchNorm1d(mid)
        self.temporal1 = nn.Conv1d(mid, mid, 7, padding=3)
        self.bn1 = nn.BatchNorm1d(mid)
        self.temporal2 = nn.Conv1d(mid, mid // 2, 7, padding=3)
        self.bn2 = nn.BatchNorm1d(mid // 2)
        self.pool = nn.AdaptiveAvgPool1d(1)
        self.head = nn.Linear(mid // 2, 1)
        self.drop = nn.Dropout(0.3)

    def forward(self, x):
        x = F.gelu(self.bn0(self.spatial(x)))
        x = F.gelu(self.bn1(self.temporal1(x)))
        x = self.drop(x)
        x = F.gelu(self.bn2(self.temporal2(x)))
        return self.head(self.pool(x).squeeze(-1)).squeeze(-1)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = MEGDecoder(n_channels=n_ch).to(device)
print(f"\nDevice: {device} | params: {sum(p.numel() for p in model.parameters()):,}")

# Target statistics from the training split only, to avoid leakage.
train_targets = torch.cat([b.data["char_count"].squeeze(-1) for b in train_loader])
y_mean, y_std = train_targets.mean().item(), train_targets.std().item() + 1e-6
print(f"Target μ={y_mean:.2f} σ={y_std:.2f}")

def prep(batch):
    # Move a batch to the device, z-score MEG per channel and targets globally.
    x = batch.data["meg"].to(device).float()
    y = batch.data["char_count"].squeeze(-1).to(device).float()
    x = (x - x.mean(-1, keepdim=True)) / (x.std(-1, keepdim=True) + 1e-6)
    y = (y - y_mean) / y_std
    return x, y
```

We split the dataset into training, validation, and test sets to ensure proper model evaluation. We create data loaders and inspect batch shapes to confirm correct data formatting. We then define a convolutional neural network and prepare normalized inputs and targets for stable training.

```python
EPOCHS = 15
opt = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1e-4)
sched = torch.optim.lr_scheduler.CosineAnnealingLR(opt, T_max=EPOCHS)
loss_fn = nn.MSELoss()
hist = {"tr": [], "va": [], "r": []}

def pearson(a, b):
    # Pearson correlation between two 1-D tensors.
    a, b = a - a.mean(), b - b.mean()
    return (a * b).sum() / (a.norm() * b.norm() + 1e-8)

print("\n" + "=" * 64)
print(f"{'Epoch':>5} | {'train':>9} | {'val':>9} | {'val_r':>7}")
print("=" * 64)

for ep in range(EPOCHS):
    model.train(); tr = []
    for batch in train_loader:
        x, y = prep(batch)
        loss = loss_fn(model(x), y)
        opt.zero_grad(); loss.backward()
        torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)
        opt.step(); tr.append(loss.item())
    sched.step()

    model.eval(); va, P, T = [], [], []
    with torch.no_grad():
        for batch in val_loader:
            x, y = prep(batch); p = model(x)
            va.append(loss_fn(p, y).item()); P.append(p.cpu()); T.append(y.cpu())
    P, T = torch.cat(P), torch.cat(T)
    r = pearson(P, T).item()
    hist["tr"].append(np.mean(tr)); hist["va"].append(np.mean(va)); hist["r"].append(r)
    print(f"{ep+1:>5d} | {np.mean(tr):>9.4f} | {np.mean(va):>9.4f} | {r:>+7.3f}")

model.eval(); P, T = [], []
with torch.no_grad():
    for batch in test_loader:
        x, y = prep(batch)
        P.append(model(x).cpu()); T.append(y.cpu())
P, T = torch.cat(P), torch.cat(T)
test_r = pearson(P, T).item()
test_mse = ((P - T) ** 2).mean().item()
print(f"\nTEST | Pearson r = {test_r:+.3f} MSE = {test_mse:.3f}")
print("(Synthetic-MEG signals are random by design — small/zero r is expected.)")

fig, ax = plt.subplots(1, 3, figsize=(15, 4))
ax[0].plot(hist["tr"], label="train"); ax[0].plot(hist["va"], label="val")
ax[0].set(xlabel="Epoch", ylabel="MSE", title="Loss curves"); ax[0].legend(); ax[0].grid(alpha=.3)
ax[1].plot(hist["r"], color="C2"); ax[1].axhline(0, color="k", ls="--", alpha=.4)
ax[1].set(xlabel="Epoch", ylabel="Pearson r", title="Validation correlation"); ax[1].grid(alpha=.3)
m = float(max(T.abs().max(), P.abs().max()))
ax[2].scatter(T.numpy(), P.numpy(), s=10, alpha=.35)
ax[2].plot([-m, m], [-m, m], "k--", alpha=.4)
ax[2].set(xlabel="True (z-scored char count)", ylabel="Predicted",
          title=f"Test predictions (r = {test_r:+.3f})"); ax[2].grid(alpha=.3)
plt.tight_layout(); plt.show()

print("\n Tutorial complete!")
print(f" • Study used : {study_name}")
print(" • Pipeline : Chain → Segmenter → SegmentDataset → DataLoader")
print(" • Custom extractor : CharCount (subclass of BaseStatic)")
print(" • Built-in extractor: MegExtractor @ 100 Hz")
print(" • Model : 1×1 spatial conv + 2 temporal convs + linear head")
```

We train the neural network using a structured training loop with loss tracking and learning-rate scheduling. We evaluate the model on the validation and test sets using metrics such as MSE and Pearson's correlation, and we visualize training performance and predictions to understand how well the model learns from the data.
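Because the synthetic MEG signals carry no genuine linguistic information, the test correlation should land at chance level. One way to make "chance" concrete is a permutation baseline. Below is a minimal sketch, not part of the original tutorial, assuming the `pearson` helper and the `P`, `T`, and `test_r` values from the evaluation code above:

```python
# Permutation baseline (illustrative addition, not from the original post):
# shuffle the targets many times and record the null distribution of
# Pearson r; the observed test_r should fall inside it for random signals.
def permutation_baseline(preds, targets, n_perm: int = 1000):
    rs = []
    for _ in range(n_perm):
        shuffled = targets[torch.randperm(len(targets))]
        rs.append(pearson(preds, shuffled).item())
    rs = torch.tensor(rs)
    return rs.mean().item(), rs.std().item()

null_mu, null_sd = permutation_baseline(P, T)
print(f"Null r: {null_mu:+.3f} ± {null_sd:.3f} | observed r: {test_r:+.3f}")
```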
In conclusion, we demonstrated how we can bridge neural data and language understanding using deep learning. We implemented a full pipeline, from raw event extraction to model training and evaluation, while maintaining flexibility through reusable components like chains, segmenters, and extractors. Although we worked with synthetic MEG signals, the framework we built is directly applicable to real-world datasets and more complex decoding tasks. This exercise highlights how we can combine neuroscience, machine learning, and structured pipelines to advance interpretable brain decoding systems, laying a strong foundation for more advanced neuroAI applications. The full code and notebook are available via the original MarkTechPost article.

Back to the bigger picture: a story like this hints at what the market is actually looking for right now: speed, reliability, and output you can measure. In AI, the winner is rarely the one shouting loudest about capability; it is the one whose tools a team can most easily pick up to get real work done.

At the product and operations level, a story like this usually signals one thing: the companies that learn faster will hold the advantage. If workflows keep getting more automated, teams that still do most things manually will lose agility. If distribution keeps tightening, brands with strong channels will pull ahead. So even when a headline looks niche, the implications often land much closer to everyday business decisions than people assume.

There is also a competitive layer that often gets overlooked. Once one big player moves, smaller players usually face two options: level up or become steadily less relevant. That is why I prefer to read news not as a single event but as part of a pattern. Who moves first? Who waits? Who executes more cleanly? From there you can usually tell whether a trend is still hype or has started becoming infrastructure.

For readers who care about practical outcomes, the most useful question is not "is this cool?" but "what should I change after reading this?". If you are a founder, the answer may sit in positioning, pricing, or distribution channels. If you are a trader, what deserves monitoring is sentiment, momentum, and whether the market has already overreacted. If you just want a quick update, at minimum you now understand why this topic surfaced and why people are starting to talk about it.

I am also deliberately leaving room for calmer context, because loud news tends to push people toward conclusions too quickly. Not every headline means a revolution. Some are pure noise; some really are the beginning of a change. The difference lies in the consistency of the follow-through. If this topic keeps reappearing over the next few cycles, chances are we are watching a serious shift, not just daily buzz.

So if you want the short version: "A Coding Implementation of End-to-End Brain Decoding from MEG Signals Using NeuralSet and Deep Learning for Predicting Linguistic Features" matters not just because of its title, but because it shows a direction of movement that can affect how people build products, read markets, and shape strategy. For me, that is the part most worth taking home. The rest you can keep as detail, but the broad direction is clear enough: this shift deserves monitoring, not skipping.

AI updates are moving fast, so do not stop at the headlines.

Editorial note

If you take only one thing from this article: this is an AI Updates item based on MarkTechPost reporting.

Original source

This article is an editorial rewrite of a MarkTechPost report. Read the original article at MarkTechPost.