AI May 23, 2023
DarkBERT: New AI Model Could Help Detect Dark Web Activity
A team of scientists from South Korea developed an AI model trained exclusively with data from the eerie deep recesses of the internet, the Dark Web. Dubbed DarkBERT, the model can be used to identify and flag cybersecurity threats, including ransomware and data leaks.
The researchers demonstrated that DarkBERT can also be used to crawl through multiple dark web forums and monitor them for any exchange of harmful content, according to a preliminary version of the study published on arXiv.
Unlike general-purpose chatbots such as ChatGPT or Bard, the new AI is used to analyze and produce answers based on a specific dataset, per the Korea Advanced Institute of Science and Technology team, which worked with data intelligence organization S2W.
🎉 Exciting news! Our talented AI team researchers have just had their paper accepted at ACL 2023, a top conference in the field! 📚🤖
🎓DarkBERT: https://t.co/Qon8XLXAG5 #AI #Research #ACL2023 #BERT #DarkBERT
— S2W (@S2W_Official) May 18, 2023
DarkBERT crawls dark parts of the internet
The dark web is a hidden part of the internet that is often used for illegal activities, such as drug trafficking, weapons sales, and human trafficking. The dark web is usually not indexed by search engines like Google and can only be accessed using special software, such as Tor.
Researchers used the Tor network to collect vast amounts of raw data from the dark web for their large language model, DarkBERT. The data included pages on topics such as cryptocurrency, hacking, and pornography, with pornography accounting for over 1,000 pages of the dataset.
DarkBERT was trained on this data after it was filtered to remove sensitive material such as illicit images, names of victim organizations, and details of leaked user information. The AI is built upon the BERT framework developed by Google and later refined by Facebook into RoBERTa.
DarkBERT’s pretraining process and the evaluation scenarios.
According to the Korean researchers, DarkBERT was built to test whether training on dark web data would help AI tools better understand the kind of language used in those environments. It did, they said, outperforming both Google’s and Facebook’s versions.
“Our evaluation results show that DarkBERT-based classification model outperforms that of known pretrained language models,” the researchers wrote in their paper.
“…our automated web crawler takes the approach of removing any non-text media and only stores raw text data. By doing so, we do not expose ourselves to any sensitive media that is potentially illegal,” added the study.
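The text-only approach the researchers describe can be illustrated with a minimal Python sketch. This is not the team's crawler: the `TextOnlyExtractor` class, the `extract_raw_text` helper, and the `SKIPPED_TAGS` list are hypothetical names chosen for illustration, and the sketch assumes a page's HTML has already been fetched. It simply shows how a parser can keep a page's visible text while never storing images, video, audio, or embedded scripts.

```python
from html.parser import HTMLParser

# Hypothetical list of elements whose contents are never stored.
# Image tags carry no text content, so they are dropped implicitly.
SKIPPED_TAGS = {"script", "style", "svg", "canvas", "video", "audio", "object"}


class TextOnlyExtractor(HTMLParser):
    """Collects visible text and ignores media and embedded code."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped element
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        # Attributes such as src or alt are ignored entirely.
        if tag in SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text found outside skipped elements.
        if self._skip_depth == 0:
            text = data.strip()
            if text:
                self._chunks.append(text)

    def text(self):
        return "\n".join(self._chunks)


def extract_raw_text(html):
    """Return the raw text of an HTML page, one text fragment per line."""
    parser = TextOnlyExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


if __name__ == "__main__":
    page = (
        "<html><body><h1>Forum</h1><img src='a.jpg' alt='x'>"
        "<p>Selling <b>data</b></p><script>var a = 1;</script></body></html>"
    )
    print(extract_raw_text(page))
```

Because only the strings passed to `handle_data` are ever kept, media files are never downloaded or written to disk in this sketch; a real crawler would also need to handle fetching, character encodings, and malformed markup.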
No public access
Despite its curious name, the team says DarkBERT could be used to detect websites that sell ransomware or leaked private data. It could also make it easier for security researchers and law enforcement to identify and track down criminals who operate on the dark web.
DarkBERT will not be made available to the public anytime soon because of the potentially dangerous nature of dark web materials. But researchers said those looking to use the AI model for academic purposes can request access.
AI
Judge Orders All AI-Generated Research To Be Declared in Court
A Texas federal judge has ordered that AI-generated content should not be used to make arguments in court, and that such information must be declared and verified by a human.
Judge Brantley Starr’s ruling comes after one attorney, Steven Schwartz, last week allowed OpenAI’s ChatGPT to “supplement” his legal research by providing him with six cases and relevant precedent. All six cases were fabricated, completely “hallucinated” by the chatbot.
The debacle received wide coverage, leaving Schwartz with “regrets.” Other lawyers who may have been contemplating trying the stunt now have to think twice, as Judge Starr has put an end to it.
Judge Starr also added a requirement that any attorney who appears in his courtroom declare that “no portion of the filing was drafted by generative artificial intelligence,” or if it was, that it was checked “by a human being.”
Judge Starr lays down the law
The eminent judge has set specific rules for his courtroom, just like other judges, and recently added the Mandatory Certification Regarding Generative Artificial Intelligence.
This states that: “All attorneys appearing before the Court must file on the docket a certificate attesting either that no portion of the filing was drafted by generative artificial intelligence (such as ChatGPT, Harvey.AI, or Google Bard) or that any language drafted by generative artificial intelligence was checked for accuracy, using print reporters or traditional legal databases, by a human being.”
A form for lawyers to sign is appended, noting that “quotations, citations, paraphrased assertions and legal analysis are all covered by this proscription.”
According to a report by TechCrunch, summarization is one of AI’s strong suits, and finding and summarizing precedent or previous cases is advertised as potentially helpful in legal work. As such, this ruling may throw a major spanner in the works for AI.
The certification requirement includes a pretty well-informed and convincing explanation of its necessity.
It states that: “These platforms are incredibly powerful and have many uses in the law: form divorces, discovery requests, suggested errors in documents, anticipated questions at oral argument.
“But legal briefing is not one of them. Here’s why.
“These platforms in their current states are prone to hallucinations and bias,” reads part of the certification.
It further explains that, on hallucinations, AI is prone to simply making stuff up, even quotes and citations. The other issue relates to reliability or bias.
Chatbots don’t swear an oath
The certification further notes that although attorneys swear an oath to set aside their personal prejudices, biases, and beliefs to faithfully uphold the law and represent their clients, generative AI is the product of programming devised by humans who did not have to swear such an oath.
In the case of Schwartz, he said in an affidavit that he was “unaware of the possibility that its (ChatGPT) content could be false.”
He added that he “greatly regrets” using the generative AI and will in future use it to “supplement” his research only with absolute caution and validation, further claiming he had never used ChatGPT prior to this case.
The other side of ChatGPT
Launched last November, ChatGPT is a large language model developed by OpenAI. The AI-powered chatbot is trained on vast amounts of data from the internet and can perform a variety of tasks such as generating text and translating languages.
Despite going viral and provoking a fierce AI race, ChatGPT has its downsides: it can hallucinate, and it misled Schwartz, who was representing Roberto Mata in a lawsuit against Colombian airline Avianca. Effectively, the chatbot provided citations to cases that did not exist.
Yet when Schwartz asked ChatGPT if one of the supposed cases was a real case, it responded “yes, (it) is a real case.” When asked for sources, the chatbot told Schwartz the case could be found “on legal research databases such as Westlaw and LexisNexis.”
A lawyer used ChatGPT to do "legal research" and cited a number of nonexistent cases in a filing, and is now in a lot of trouble with the judge 🤣 pic.twitter.com/AJSE7Ts7W7
— Daniel Feldman (@d_feldman) May 27, 2023
The matter came to light after the opposing counsel flagged the ChatGPT-generated citations as fake.
US District Court Judge Kevin Castel confirmed six of them as non-existent and demanded an explanation from Schwartz.
“Six of the submitted cases appear to be bogus judicial decisions with bogus quotes and bogus internal citations,” wrote Judge Castel in a May 4 order.
AI
Nvidia Debuts AI Tools in an Era Where “Anyone Can Be a Programmer”
The world’s most valuable chip maker Nvidia has unveiled a new batch of AI-centric products, as the company rides on the generative AI wave where anyone can be a programmer.
Nvidia announced a new supercomputer and a networking system, while the company also aims to make video game characters more realistic.
The wide range of products includes robotics design, gaming capabilities, advertising services, and networking technology, which CEO Jensen Huang unveiled during a two-hour presentation in Taiwan on Monday.
Most notable of the new products is the AI supercomputer platform named DGX GH200 that will help tech companies create successors to OpenAI’s ChatGPT.
According to the company, the new DGX GH200 supercomputers combine 256 GH200 superchips that can act as a single graphics processing unit (GPU). The result is a system that boasts nearly 500 times the memory of a single Nvidia DGX A100 system.
“Generative AI, large language models, and recommender systems are the digital engines of the modern economy,” said Huang.
“DGX GH200 AI supercomputers integrate Nvidia’s most advanced accelerated computing and networking technologies to expand the frontier of AI.”
So far, Microsoft Corp., Meta Platforms Inc., and Alphabet’s Google are expected to be among the first users, according to Nvidia.
The DGX GH200 supercomputers are expected to be available by the end of 2023.
The GH200 superchips, which power the new supercomputer, work by combining Nvidia’s Arm-based Grace CPU and an Nvidia H100 Tensor Core GPU in a single package.
The chipmaker also revealed that it’s building its own supercomputer running four DGX GH200 systems at the same time to power its own research.
Nvidia also released its ACE generative AI model for video games, enabling gaming companies to use generative AI for large games with multiple non-player characters, giving them unique lines of dialogue and ways to interact with players that would normally need to be individually programmed.
Easy ad content
Alongside the hardware announcement, the company said it has partnered with advertising giant WPP to create a content engine that uses its Omniverse technology and generative AI capabilities to help build out ad content.
The move is intended to cut down the time and cost of producing ads by enabling WPP’s clients to lean on Nvidia’s technology.
Electronics manufacturers such as Foxconn, Pegatron, and Wistron are using Omniverse technology to create digital twins of their factory floors, so they can get a sense of how best to lay them out before making any physical changes.
A new computing era
Presenting at the forum, Huang acknowledged that advancements in AI are ushering in a new era in computing. He says anyone can be a programmer simply by speaking to the computer.
According to the Nvidia boss, gone are the days when programmers would write lines of code, only for it to display the “fail to compile” response because of a missing semicolon.
“This computer doesn’t care how you program it, it will try to understand what you mean, because it has this incredible large language model capability. And so the programming barrier is incredibly low,” said Huang.
“We have closed the digital divide. Everyone is a programmer. Now, you just have to say something to the computer,” he added.
Huang said his company has managed to bridge the digital gap, and the tech giant will continue to capitalize on the AI frenzy that has made Nvidia one of the world’s most valuable chipmakers.
Nvidia’s stock price is rising
Nvidia’s major announcements came as shares of the tech giant jumped last week on news that the company anticipated second quarter revenue above Wall Street’s expectations, based on the strength of its data center business.
The company hit the $1 trillion market cap just before the US markets opened on Tuesday. Its shares are trading at $407 in the pre-market, nearly 5% up from Monday.
Nvidia’s shares were up more than 165% year-to-date as of Friday afternoon, with the S&P 500 (^GSPC) just 9.5% higher over the same period.
Rival chip maker AMD has experienced a similar boost in share price, rising 93%. However, Intel (INTC) is lagging behind with shares up just 8%.
According to Yahoo Finance tech editor Daniel Howley, while analysts see Nvidia well ahead of its chip rivals in the AI processing space, how long that continues to be the case is anyone’s guess.
AI
ChatGPT’s Bogus Citations Land US Lawyer in Hot Water
A lawyer in the United States is facing disciplinary action after his law firm used popular AI chatbot ChatGPT for legal research and cited fake cases in a lawsuit.
Steven A. Schwartz, who is representing Roberto Mata in a lawsuit against Colombian airline Avianca, admitted to using OpenAI’s ChatGPT for research purposes, and that the AI model provided him with citations to cases that did not exist.
Mata is suing Avianca for a personal injury caused by a serving cart in 2019, claiming negligence by an employee.
Bogus all the way
According to a BBC report, the matter came to light after Schwartz, a lawyer with 30 years of experience, used these cases as precedent to support Mata’s case.
But the opposing counsel flagged the ChatGPT-generated citations as fake. US District Court Judge Kevin Castel confirmed six of them as non-existent. He demanded an explanation from Schwartz, an attorney with New York-based law firm Levidow, Levidow & Oberman.
“Six of the submitted cases appear to be bogus judicial decisions with bogus quotes and bogus internal citations,” Judge Castel wrote in a May 4 order.
“The court is presented with an unprecedented circumstance.”
The supposed cases include: Varghese v. China Southern Airlines, Martinez v. Delta Airlines, Shaboon v. EgyptAir, Petersen v. Iran Air, Miller v. United Airlines, and Estate of Durden v. KLM Royal Dutch Airlines, none of which appeared to exist to either the judge or the defense.
Lawyer claims ignorance
ChatGPT is a large language model developed by OpenAI. Launched in November 2022, the AI is trained on vast amounts of data from the Internet and can perform a variety of tasks, such as generating text, translating languages, writing poetry, and even solving difficult math problems.
But ChatGPT is prone to “hallucinations” – tech industry speak for when AI chatbots produce false or misleading information, often with confidence.
In an affidavit last week, Schwartz said he was “unaware of the possibility that its [ChatGPT] content could be false.”
Schwartz claimed to have never used ChatGPT prior to this case. He said he “greatly regrets having utilized generative artificial intelligence to supplement the legal research performed herein and will never do so in the future without absolute verification of its authenticity.”
The career attorney now faces a court hearing on June 8 after accepting responsibility for not confirming the authenticity of the ChatGPT sources. Schwartz was asked to show cause why he shouldn’t be sanctioned “for the use of a false and fraudulent notarization.”
ChatGPT’s confident lies
According to the BBC report, Schwartz’s affidavit contained screenshots of the attorney’s exchanges with ChatGPT.
Schwartz asked the chatbot, “is varghese a real case?”, to which ChatGPT responded “yes, [it] is a real case.” When asked for sources, it told the attorney that the case could be found “on legal research databases such as Westlaw and LexisNexis”.
Again, the attorney asked: “Are the other cases you provided fake?” ChatGPT responded “No”, adding that the cases could be found on other legal databases. “I apologize for the confusion earlier,” ChatGPT said.
“Upon double-checking, I found the case Varghese v. China Southern Airlines Co. Ltd., 925 F.3d 1339 (11th Cir. 2019), does indeed exist and can be found on legal research databases such as Westlaw and LexisNexis. I apologize for any inconvenience or confusion my earlier responses may have caused,” the chatbot replied with confidence.