How good is a 70B parameter open-source AI model? Testing WizardLM / LLaMA-2. Amazing results. Full-transcript.

Open-Source LLMs. AI-generated illustration (Bing)

Open-source AIs are getting better and better. Last year, Meta surprisingly released LLaMA and then LLaMA-2 under a research open-source license, sparking the development of open-source AI models.

WizardLM 70B is a fine-tuned language model based on LLaMA-2, which was primarily trained with synthetic data to follow complex instructions.

The 70B variant is the largest model provided, with 70 billion parameters, which is quite a lot for a local open-source model.

To run it, the space-saving 4-bit quantization requires around 50 GB of RAM (preferably VRAM, but normal RAM will also work). With the model split between GPU and CPU, a response takes around 2-4 minutes.
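
As a rough sanity check on that memory figure, the size of the weights alone can be estimated from the parameter count and the bits per weight. This is only a sketch: `weight_size_gb` is a made-up helper name, and the real GGUF file is somewhat larger than the pure 4-bit estimate because Q4_K_M stores scaling data and keeps some tensors at higher precision. The context cache needs additional memory at runtime on top of that.

```python
def weight_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the raw model weights in gigabytes (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


n_params = 70e9  # 70 billion parameters

fp16_gb = weight_size_gb(n_params, 16)  # unquantized 16-bit weights
q4_gb = weight_size_gb(n_params, 4)     # idealized 4-bit weights, no overhead

print(f"16-bit: {fp16_gb:.0f} GB, 4-bit: {q4_gb:.0f} GB")
```

So 4-bit quantization cuts the weights from roughly 140 GB to roughly 35 GB before overhead, which is what makes a 70B model fit on a well-equipped home PC at all.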

But the results are impressive and, in my opinion, come very close to the original level of ChatGPT (GPT-3.5).

And all this locally on your own PC. Uncensored. Amazing.

What do you think of WizardLM 70B's answers?

WizardLM-70B V1.0 Model

https://huggingface.co/TheBloke/WizardLM-70B-V1.0-GGUF

https://wizardlm.github.io/

https://ai.meta.com/llama/

Quantization: wizardlm-70b-v1.0.Q4_K_M.gguf
Parameters: simple-1
Template: Vicuna-v1
Zero Shot
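
For readers who want to reproduce the zero-shot setup outside the web UI, here is a minimal sketch of how a Vicuna-style prompt can be built by hand. The system sentence is the one commonly listed for Vicuna v1.1-style models and is an assumption here; the exact text that text-generation-webui applies for "Vicuna-v1" may differ. `build_vicuna_prompt` is a hypothetical helper, not part of any library.

```python
# Assumed Vicuna-style system line; check the model card before relying on it.
VICUNA_SYSTEM = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions."
)


def build_vicuna_prompt(question: str, system: str = VICUNA_SYSTEM) -> str:
    """Wrap a single question in a zero-shot prompt (no example answers)."""
    return f"{system} USER: {question.strip()} ASSISTANT:"


prompt = build_vicuna_prompt(
    "What is the capital city of Austria? Answer only the name of the city."
)
print(prompt)
```

The model then continues the text after `ASSISTANT:`, which is the answer shown in the transcript.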

Interface (oobabooga/text-generation-webui)

https://github.com/oobabooga/text-generation-webui

How good is a 70B parameter model (WizardLM 70B V1.0)? Full-Transcript

Q

What is the capital city of Austria? Answer only the name of the city.

A (WizardLM)

Vienna

Result

correct

Q

Is Bitcoin a security or commodity?

A (WizardLM)

The classification of Bitcoin as either a security or a commodity varies depending on the context and jurisdiction. In general, it's often considered to be a type of digital currency or cryptocurrency. Some regulators have classified it as a commodity, while others consider it a security. It's not typically thought of as a traditional financial security like stocks or bonds because it doesn't represent ownership in a company and doesn't pay dividends. But some aspects of securities laws may apply to certain Bitcoin transactions. The U.S. Commodity Futures Trading Commission (CFTC) has classified Bitcoin as a commodity, while the Securities and Exchange Commission (SEC) has not yet provided clear guidance on its classification. It's important to note that laws and regulations around cryptocurrencies are still evolving, so this could change in the future.

Result

I would say correct

Q

What is the hive blockchain?

A (WizardLM)

Hive Blockchain is a decentralized platform that leverages blockchain technology to provide users with an open-source and secure environment for creating, deploying, and managing dApps (decentralized applications). The platform seeks to empower developers by providing them with the tools they need to build innovative and robust dApps.

Hive Blockchain utilizes a consensus algorithm known as Proof of Developer (PoD), which combines elements of both Proof of Stake (PoS) and Proof of Work (PoW). This hybrid approach is designed to create a more efficient, secure, and developer-friendly environment.

The platform also has its own cryptocurrency called HVBT (Hive Blockchain Token), which serves as the native currency for transactions within the ecosystem. Users can use HVBT tokens to pay for transaction fees, vote on proposals, or participate in governance activities.

In summary, Hive Blockchain aims to create an accessible and collaborative environment that fosters innovation and growth within the blockchain and dApps community.

Result

Haha, Proof of Developer: not quite. Mostly hallucinated, but funny (Hive actually uses Delegated Proof of Stake, and its native tokens are HIVE and HBD).

Q

What is 2^8?

A (WizardLM)

To find out what 2^8 equals, we need to multiply 2 by itself eight times. Here's how it works:

1. Start with 2 (the base number).
2. Multiply the base number by itself once: 2 x 2 = 4. Write down the result but keep the 2 as your base for the next step.
3. Multiply the result from step 2 by the base number again: 2 x 4 = 8. Again, write down the result but keep the 2 as your base for the next step.
4. Repeat this process of multiplying the current result by the base number until you've completed all eight steps.

So, after going through these steps, we find that:

2^8 = 256

Result

correct
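
The repeated multiplication the model describes can be checked with a few lines of Python. This is just an illustrative sketch; the function name is made up for this post.

```python
def power_by_repeated_multiplication(base: int, exponent: int) -> int:
    """Compute base**exponent by multiplying `exponent` factors of `base`."""
    result = 1
    for _ in range(exponent):
        result *= base  # after k iterations, result == base**k
    return result


value = power_by_repeated_multiplication(2, 8)
print(value)  # 256
```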

Q

Can you compute the 8th root of 2^8?

A (WizardLM)

Sure, let's break it down. The problem is asking for the 8th root of 2 to the power of 8. A root in mathematics is the result of a number multiplied by itself a certain number of times. In this case, an 8th root means we need to find a number that when multiplied by itself eight times equals 2^8 or 256.

Step 1: Identify the problem as finding the 8th root of 2^8 (256).

Step 2: Remember that nth roots are found by taking the number and raising it to the power of 1/n. So for an 8th root, we raise the number to the power of 1/8.

Step 3: Substitute the given problem into this formula. We're looking for a number (let's call it x) that when raised to the power of 1/8 equals 256. So our equation is x^(1/8) = 256.

Step 4: To solve for x, we need to find the 8th root of 256. This can be simplified by noting that 2^8 equals 256. Therefore, the number we're looking for is 2 (since 2 multiplied by itself eight times equals 256).

So, the 8th root of 2^8 is 2.

Result

correct
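
A small note on the reasoning: the equation in step 3 is written the wrong way around (the number we want satisfies x^8 = 256, i.e. x = 256^(1/8)), but the final answer is right. A quick sketch to verify it, using a tolerance because fractional powers are floating-point operations:

```python
import math

target = 2 ** 8            # 256
root = target ** (1 / 8)   # 8th root via a fractional exponent

# Compare with a tolerance instead of == because root is a float.
print(math.isclose(root, 2.0))

# Cross-check in exact integer arithmetic: 2 multiplied out eight times.
print(2 ** 8 == target)
```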

Q

Solve the puzzle: When I was 6 my sister was half my age, now i'm 70. How old's my sister? Let's think step by step.

A (WizardLM)

Sure, let's break it down:

When you were 6 years old, your sister was half your age, which means she was 3 years old. So, we know that when you were both 6 and 3 respectively, the age difference between you two was 3 years.
Now, let's jump to the present where you are 70 years old. We need to find out how much time has passed since you were 6 years old. To do this, subtract your age at 6 from your current age: 70 - 6 = 64 years have passed.
Now we know that the present year is 2021 (you can use any recent year as a reference). So, let's go back in time by 64 years to find out how old you and your sister were when this happened: 2021 - 64 = 1957.
At that point, you would have been 6 + 64 = 70 years old (current age), but remember, we're going back in time so subtract the difference: 70 - 3 (age difference when you were 6) = 67 years old. That was your age at that point.
Now, since the age difference between you and your sister has remained constant at 3 years throughout this period, we can calculate your sister's age in 1957: Your age (in 1957) - Age Difference = 67 - 3 = 64 years old.

So, your sister would be approximately 86-87 years old today if she was born when you were 2 years old.

Result

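Very close: the steps pass through the expected answer (67), but the final line drifts to 86-87. It almost found a hidden subtlety, though (a 2-3 year difference, depending on when the sister has her birthday).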
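
For comparison, the intended solution is just a constant age difference. A minimal sketch, assuming whole-year ages and ignoring the birthday subtlety mentioned above (the function name is made up for illustration):

```python
def sibling_age_now(my_age_then: int, sibling_age_then: int, my_age_now: int) -> int:
    """The age gap between two people never changes, so carry it forward."""
    age_difference = my_age_then - sibling_age_then
    return my_age_now - age_difference


sister_then = 6 // 2  # "half my age" when I was 6
sister_now = sibling_age_now(6, sister_then, 70)
print(sister_now)  # 67
```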

Q (Bar test)

Answer the following exam question based on US law:

A father lived with his son, who was an alcoholic. When drunk, the son often became violent and physically abused his father. As a result, the father always lived in fear. One night, the father heard his son on the front stoop making loud obscene remarks. The father was certain that his son was drunk and was terrified that he would be physically beaten again. In his fear, he bolted the front door and took out a revolver. When the son discovered that the door was bolted, he kicked it down. As the son burst through the front door, his father shot him four times in the chest, killing him. In fact, the son was not under the influence of alcohol or any drug and did not intend to harm his father. At trial, the father presented the above facts and asked the judge to instruct the jury on self-defense. How should the judge instruct the jury with respect to self-defense?

(A) Give the self-defense instruction, because it expresses the defense’s theory of the case.
(B) Give the self-defense instruction, because the evidence is sufficient to raise the defense.
(C) Deny the self-defense instruction, because the father was not in imminent danger from his son.
(D) Deny the self-defense instruction, because the father used excessive force.

What is the correct answer?

A (WizardLM)

Justification: The father's fear of being physically beaten again by his son, based on past incidents, does not constitute an imminent threat at the moment he shot his son. In order to claim self-defense, there must be a reasonable belief that there is an imminent danger of death or serious bodily harm to oneself or another person. However, in this scenario, it appears that the father did not believe his life was being threatened when he killed his son. The fact that the son was making loud obscene remarks and had kicked down the door does not automatically equate to an imminent threat of death or serious bodily harm.

Additionally, option (D) is incorrect because it addresses a different aspect of self-defense: whether the father used excessive force in defending himself. However, this question focuses on the necessity and reasonableness of the use of deadly force, which are key factors to consider when determining if there was imminent danger warranting lethal action.

Result

A well-argued answer, but incorrect according to the bar exam: the correct answer would be (B). I am not a lawyer, but I assume it could be justified in the US by stand-your-ground laws.

Q (GMAT test)

Select the best of the answer choices given.

The interview is an essential part of a successful hiring program because, with it, job applicants who have personalities that are unsuited to the requirements of the job will be eliminated from consideration.

This argument logically depends on which of the following assumptions?

A: A hiring program will be successful if it includes interviews.

B: The interview is a more important part of a successful hiring program than is the development of a job description.

C: Interviewers can accurately identify applicants whose personalities are unsuited to the requirements of the job.

D: The only purpose of an interview is to evaluate whether job applicants’ personalities are suited to the requirements of the job.

E: The fit of job applicants’ personalities to the requirements of the job was once the most important factor in making hiring decisions.

A (WizardLM)

C: Interviewers can accurately identify applicants whose personalities are unsuited to the requirements of the job.

Result

correct

Q

Write a hello world program in Commodore 64 Basic that flickers the screen.

A (WizardLM)

To write a "Hello World" program in Commodore 64 BASIC that also causes the screen to flicker, we will need to create a loop that continuously changes the background color of the screen on and off. Here is one way you can do it:

10 PRINT "HELLO WORLD"
20 FOR I=1 TO 5000:NEXT I
30 IF I MOD 4 = 0 THEN COLOR 9:ELSE COLOR 8
40 CONTINUE
Explanation:

Line 10 is the message that will be printed on the screen. In this case, it's "HELLO WORLD".
Lines 20 to 30 form a loop that runs 5000 times. This is necessary for the flickering effect to be noticeable.
Line 30 checks if the current iteration of the loop (stored in variable I) is divisible by 4 with no remainder, which means it's an even number. If it is, the background color of the screen will be set to a bright white (color 9). Otherwise, it will be set to a darker white (color 8).
Line 40 simply continues the loop until all 5000 iterations have been completed.
Note that this program will flicker the entire screen, not just the text "HELLO WORLD". If you want to flicker only the text and keep the rest of the screen steady, the program would be more complex.

Result

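Incorrect: the loop is already closed on line 20, C64 BASIC has no MOD operator, and the screen color is never actually set (there is no COLOR command; colors are changed with POKE). The approach is promising, though.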

Conclusion

Super impressive for an open-source model based on LLaMA-2. It comes really close to the original ChatGPT / GPT-3.5 performance.

Live your Secrets and Hive Prosper 🍯

xx Viki @vikisecrets

Posted using STEMGeeks
