Questions tagged [scala]
Scala is a general-purpose programming language principally targeting the Java Virtual Machine. Designed to express common programming patterns in a concise, elegant, and type-safe way, it fuses both imperative and functional programming styles. Its key features are: an advanced static type system ...
82,199 questions
-1
votes
1answer
9 views
Scala: group an Array[(String, Array[String])] by the String
I have a structure of type Array[(String, Array[String])].
It contains similar Strings,
e.g.:
"A",["b","bc","f","df"]
"B",["b","df","sef","g"]
"A",["s","rg","rg"]
"B",["f","dfv","x"]
I want it ...
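One possible reading of the goal, as a rough sketch: it assumes the structure is an Array[(String, Array[String])] and that arrays sharing the same key should be concatenated. The desired output is cut off above, so the merging rule and the names `GroupByKeySketch` and `mergeByKey` are assumptions, not the asker's code.

```scala
object GroupByKeySketch {
  // Sample data copied from the question.
  val data: Array[(String, Array[String])] = Array(
    "A" -> Array("b", "bc", "f", "df"),
    "B" -> Array("b", "df", "sef", "g"),
    "A" -> Array("s", "rg", "rg"),
    "B" -> Array("f", "dfv", "x")
  )

  // Group by the String key, then concatenate the arrays of each group.
  // Concatenation is an assumption; the expected output is not shown in the excerpt.
  def mergeByKey(in: Array[(String, Array[String])]): Map[String, Array[String]] =
    in.groupBy(_._1).map { case (key, pairs) => key -> pairs.flatMap(_._2.toSeq) }
}
```

`groupBy` keeps the original order of the pairs inside each group, so the concatenated values appear in input order.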
0
votes
0answers
13 views
How does a Slick profile work with respect to a Slick DB?
I am having trouble understanding how to use a Slick profile.
My problem:
I am trying to use Slick with Akka Streams via the Alpakka JDBC plugin. The example given online is as follows:
#Load using ...
1
vote
2answers
127 views
Play-Slick Repository Pattern
I am trying to implement a simple Repository in a PlayFramework app with the play-slick plugin.
I am new to Scala, new to Slick, new to Play.
When trying to compile my code, I get the following ...
0
votes
0answers
8 views
How to employ type member constraints for use with type classes?
I have a function with signature currently like so:
def runInSystem[J](job: J)
(implicit ev: Job[J] {type M <: SysJobMetaData},
asys: ActorSystem, jobContextService: JobContextService
): ...
0
votes
0answers
5 views
Scala Cats documentation: Monoid[Pair[A,B]]
I've been away from Scala for several years (and never really got comfortable with it to begin with), instead working with languages that were (for me) easier to learn and work with.
I am now coming ...
0
votes
1answer
18 views
How to subtract values when keys are the same in pair RDDs?
I have two pair RDDs (Int, BreezeDenseMatrix[Double]) and what I want is to subtract their values when the keys are the same.
E.g. when I have
RDD_1 : (1, BreezeMatrix_a)
RDD_2: (1, ...
2
votes
1answer
34 views
Schedule jobs programmatically and prevent duplicate jobs
My sensor data is captured in Hive tables and I want to run Spark jobs on those at regular intervals, say 15-, 30-, and 45-minute jobs.
We are using the cron scheduler to schedule the jobs(...
3
votes
2answers
30 views
Is there a good way to join a stream in spark with a changing table?
Our Spark environment:
Databricks 4.2 (includes Apache Spark 2.3.1, Scala 2.11)
What we are trying to achieve:
We want to enrich streaming data with some reference data, which is updated regularly. The ...
0
votes
2answers
22 views
Spark 2.3.1 on YARN: how to monitor stage progress programmatically?
I have a setup with Spark running on YARN, and my goal is to programmatically get updates of the progress of a Spark job by its application id.
My first idea was to parse HTML output of the YARN GUI. ...
0
votes
0answers
7 views
Concurrent use of akka-http client connection pools
When using akka-http's Http().cachedHostConnectionPool concurrently I get all sorts of weird errors like:
Exception in thread "main" java.lang.IllegalStateException: Substream Source cannot be ...
4
votes
1answer
42 views
Attempting to generate types from other types in Scala
I'm using Slick on a project and for that I require a Slick representation of my rows and then also an in memory representation. I'm going to use a much simpler example here for brevity. Say for ...
0
votes
2answers
42 views
Match a word but not its inverse using [^] syntax
I am trying to make a regex that doesn't match one word, but does match its reverse. For example, if the word I don't want to match is "no":
I am matching this word // will pass
I am matching no ...
0
votes
1answer
52 views
Scala: How to convert my input to a list of lists
I have the input below:
input
[level:1,firstFile:one,secondFile:secone,Flag:NA][level:1,firstFile:two,secondFile:sectwo,Flag:NA][level:2,firstFile:three,secondFile:secthree,Flag:NA]
getting below ...
0
votes
0answers
14 views
Scala Spark : java.util.NoSuchElementException: key not found: -1.0
I am running a random forest (RF) model on Spark, [https://spark.apache.org/docs/2.0.0/ml-classification-regression.html#random-forest-classifier]
My issue is that if I load 2 different dataframes for train and ...
0
votes
0answers
10 views
Any way to log information from a UDF in Databricks Spark?
I am creating some Scala UDFs to process data in a DataFrame, and am wondering whether it is possible to log information within a UDF. Are there any examples of how to do this?
0
votes
1answer
14 views
R tm package and Spark/python give different vocabulary size for Document Term Frequency task
I have a CSV with a single column; each row is a text document. All text has been normalized:
all lowercase
no punctuation
no numbers
no more than one whitespace between words
no tags(xml, html)
I ...
0
votes
0answers
19 views
How to use Scala enums for JOOQ converters
I found in the documentation how to convert a value to a Java enum using EnumConverter.
I tried to write my own converter for a Scala enum. Enum:
package com.enums
object BlockSize extends Enumeration {
...
2
votes
1answer
44 views
Why is Some(null) converted to None when mapping DataFrame to a case class
In the REPL, supplying a Some(null) argument to a case class constructor yields no surprise: the value is preserved and assigned to the field of the case class:
scala> case class ...
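The plain-Scala behaviour described here can be reproduced with a small sketch. The case class `Person` and its field are hypothetical stand-ins, since the question's own definition is cut off, and only the constructor behaviour is shown, not the DataFrame mapping.

```scala
object SomeNullSketch {
  // Hypothetical case class standing in for the truncated one in the question.
  final case class Person(name: Option[String])

  // Some(null) is an ordinary Some wrapping a null reference; nothing is converted.
  val preserved: Person = Person(Some(null))

  // Option.apply, by contrast, maps a null argument to None.
  val normalized: Person = Person(Option(null))
}
```

The contrast with `Option(null)` is included only because it is the standard-library call that does turn null into None.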
0
votes
1answer
24 views
How do you parse a String from an ArrayBuffer to a Double using Scala?
I'm trying to map Strings to Doubles from an ArrayBuffer that I parsed through Play Framework, but I keep getting the following error:
Exception in thread "main" java.lang.NumberFormatException: For ...
0
votes
0answers
13 views
How does `shapeless.Cached` work?
Running the following code
import shapeless._
final case class Foo(s: String) { println("HELLO") }
object TestApp extends App {
implicit def foo(implicit s: String): Foo = Foo(s)
implicit val ...
0
votes
0answers
14 views
Hive SQL vs Spark APIs — performance difference in separate queries vs large one
I know that SQL statements are basically broken down to the same Spark calls that the API uses, e.g. "select * from my_data_frame1 inner join my_data_frame2 on id1 = id2" works out to my_data_frame1....
0
votes
1answer
25 views
String interpolation in Scala does not allow selecting a string from an array
So here is my code block
val cols = df.columns
val w = cols(0)
val query1 = s"select $cols(0), square(age) as age, age as age2, first_name, last_name from test"
val query2 = s"select $w, square(age)...
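A minimal sketch of the difference, using a plain Array[String] as a stand-in for df.columns (the column names and the object name are made up for illustration). In an s-interpolated string, `$cols` interpolates only the identifier `cols`, and `(0)` is left as literal text. Wrapping the expression in braces, `${cols(0)}`, interpolates the element itself.

```scala
object InterpolationSketch {
  // Stand-in for df.columns; the names are illustrative only.
  val cols: Array[String] = Array("id", "age")

  // `$cols` is interpolated on its own (Array.toString), and "(0)" stays literal text.
  val wrong: String = s"select $cols(0) from test"

  // Braces delimit an arbitrary expression, so the first element is used.
  val right: String = s"select ${cols(0)} from test"
}
```

This is why binding the element to a simple name first, as `val w = cols(0)` does above, also works with `$w`.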
4
votes
1answer
65 views
Scala Type Inference Confusion: Any or Nothing?
Original Version:
trait Animal[F[_], A]
case class Cat[F[_], I, A](limits: F[I], f: I => A) extends Animal[F,A]
object ConfuseMe {
def confuse[F[_], A](tt: Animal[F, A]) = tt match {
...
0
votes
1answer
135 views
PredictionIO IntelliJ setup. Missing module SDK
I'm trying to set up IntelliJ IDEA for PredictionIO engine development in Scala. I'm following the documentation step by step. However, I am unable to build the project due to a missing SDK.
I have JDK ...
1
vote
1answer
42 views
Scala: how to manage fatal errors (like OOM) in Future
We see some strange behavior in our distributed apps. We haven't yet found out why, but we think it might be related to OutOfMemory errors.
However, we try to follow good coding practice regarding fatal ...
1
vote
0answers
34 views
Scala: not enough arguments for method map
object test {
def main(args: Array[String]): Unit = {
val df = tdw.table("video",Seq("hello")).map(record=>{
val kv = deal_json(record(3).toString)
kv
})
df.first()
...
0
votes
2answers
83 views
Akka-HTTP: How to generate a route from a List?
Given a List("segment1", IntNumber, "segment2") how can one generate a Route? There seems to be no good way of doing this. I have tried path(list.reduceLeft(_ / _)) which does not work as it's ...
3
votes
1answer
134 views
Recursive transformation between nested case classes where the fields in the target are unaligned subsets of the source class
Given a pair of case classes, Source and Target, that have nested case classes, and at each level of nesting, the fields in Target are unaligned subsets of the ones in Source, is there a way to write ...
1
vote
2answers
15 views
Cast exception when trying to cast a Coordinate Matrix
Hi, I am very new to Scala and Spark. I am writing a test to check the integrity of my data. For this I have a Coordinate Matrix and I map it with the ResultMap. Now in my testing method I need to ...
2
votes
3answers
85 views
Scala DSL: invocation that mimics English
I'm very new to Scala and this is more a question of curiosity.
Let's say I have a class
class Container()
{
def add(item: Item) ...
}
I can invoke it like this: container add item.
I ...
1
vote
1answer
20 views
Optional values with Esper
Is there any support in Esper for properties with Optional values?
I'm building a POC with Esper, using Scala, and I have the EPL query working with non-optional values. I have a simple object that I'...
1
vote
1answer
34 views
create custom JsObject out of case class
In my API I receive a case class that looks like this:
case class UserUpdate(
name: Option[String],
age: Option[String],
city: Option[String])
From this case class I need to build update ...
2
votes
1answer
38 views
Slight difference in Future.zip and Future.zipWith implementation. Why?
Let's consider the following excerpt from scala.concurrent.Future.scala:
def zip[U](that: Future[U]): Future[(T, U)] = {
implicit val ec = internalExecutor
flatMap { r1 => that.map(r2 =>...
0
votes
0answers
24 views
Error installing sbt on Ubuntu 18.04: `gpg: keyserver receive failed: Invalid argument`
I'm following the official sbt install instructions.
$ sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 2EE0EA64E40A89B84B2DF73499E82A75642AC823
Executing: /tmp/apt-key-gpghome....
0
votes
1answer
26 views
apache spark - Inserting a dataframe as nested struct into other dataframe
I have two DataFrames created in Spark:
xml_df:
root
|-- _defaultedgetype: string (nullable = true)
|-- _mode: string (nullable = true)
and nodes_df:
root
|-- nodes: struct (nullable = false)
...
0
votes
0answers
18 views
Ignoring schema mismatches between source and destination tables in cassandra using spark sql
I want to transfer data from one Cassandra table to another. But if some column is missing in the table, it should be ignored and the remaining columns inserted, and if some column is missing in the DataFrame, that column ...
2
votes
0answers
22 views
sbt scripted plugin fails as an unresolved dependency on publishing cross-compiled scala versions
Our play-googleauth library is built on Scala 2.12, and cross-compiled to Scala 2.11, using sbt 1.1.6. As the library is intended to be run in Play projects, we've historically provided an example ...
0
votes
1answer
408 views
KafkaConsumer: error on consumer.subscribe(Arrays.asList(topic))
I'm trying to configure my KafkaConsumer to read data from a KafkaProducer via the 'kafkatopic' topic.
My Scala code is:
package com.sundogsoftware.sparkstreaming
import java.util
import java.util....
1
vote
1answer
23 views
java.io.FileNotFoundException: Not found cos://mybucket.myservicename/checkpoint/offsets
I'm trying to use Spark Structured Streaming 2.3 to read data from Kafka (IBM Message Hub) and save it into IBM Cloud Object Storage on a 1.1 IBM Analytics Engine Cluster.
After creating the cluster, ...
0
votes
0answers
25 views
In Spark, writing a dataset into a database takes some pre-assumed time for the save operation
I ran the spark-submit command shown below, which loads the Datasets from the DB, processes them, and in the final stage pushes multiple datasets into an Oracle DB.
./spark-submit --class com....
0
votes
2answers
20 views
Define nested StructType
I have the following question:
Having an Option of List of elements I want to transform it into a List of elements by avoiding the use of .get on the Option.
Below is the snippet of code that I want ...
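One common way to do this, sketched without reference to the asker's own snippet (which is cut off above): `getOrElse(Nil)` turns a missing list into the empty list and unwraps a present one. The names `OptionListSketch` and `flattenOption` are illustrative only.

```scala
object OptionListSketch {
  // None becomes the empty list; Some(xs) becomes xs. No .get is involved.
  def flattenOption[A](maybeItems: Option[List[A]]): List[A] =
    maybeItems.getOrElse(Nil)
}
```

`maybeItems.toList.flatten` gives the same result, at the cost of building an intermediate list.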
-2
votes
0answers
51 views
+150
WARN ShutdownHookManager: ShutdownHook '$anon$2' timeout,
I have a Spark job which reads a file, connects to a DB server, does some calculations, and then creates different .csv files. Since yesterday it keeps failing with this error: WARN ...
9
votes
2answers
4k views
How to find file size in Scala?
I'm writing a Scala script that needs to know the size of a file. How do I do this correctly?
In Python I would do
os.stat('somefile.txt').st_size
and in Scala?
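A minimal sketch of one option: Scala can call the JDK directly, and `java.nio.file.Files.size` returns the size in bytes. The object and method names here are illustrative.

```scala
import java.nio.file.{Files, Paths}

object FileSizeSketch {
  // Size of the file in bytes. Throws an IOException (for example
  // NoSuchFileException) if the path does not exist.
  def sizeInBytes(path: String): Long =
    Files.size(Paths.get(path))
}
```

`new java.io.File(path).length` is a shorter alternative, but it returns 0 for a missing file instead of throwing, which can hide errors.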
0
votes
1answer
46 views
How to compose curried functions in Scala
Is it possible to compose functions in Scala that are curried? For example:
def a(s1: String)(s2: String): Int = s1.length + s2.length
def b(n: Int): Boolean = n % 2 == 0
def x : String => String =&...
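A hedged sketch of one way such a composition can be written. The definition of `x` is cut off above, so the composition chosen here (fix the first argument of `a`, then feed its result to `b`) is an assumption, as are the names `partial` and `composed`.

```scala
object CurriedComposeSketch {
  // Definitions copied from the question.
  def a(s1: String)(s2: String): Int = s1.length + s2.length
  def b(n: Int): Boolean = n % 2 == 0

  // Applying only the first parameter list yields a String => Int function;
  // the declared result type triggers the conversion from method to function.
  def partial(s1: String): String => Int = a(s1)

  // andThen chains the partially applied `a` with `b`.
  def composed(s1: String): String => Boolean =
    partial(s1).andThen(n => b(n))
}
```

For example, `CurriedComposeSketch.composed("ab")("cd")` adds the two lengths (4) and then checks that the sum is even.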
0
votes
0answers
27 views
Insert into file from Scala using bash command
I want to append new lines to the end of a text file using Spark (1.6) with Scala (2.10.6).
The problem is that an RDD does not allow appending content, and with a DF you can't have several columns.
This ...
0
votes
1answer
32 views
Scala: for comprehension not compiling
I am trying to run the following code:
def split(input: Int): List[Int] = {
val inputAsString = input.toString
val inputAsStringList = inputAsString.split("").toList
inputAsStringList.map(_....
-2
votes
1answer
24 views
Extract date from String in Scala
I have this text as a String: "Report Date 2018-05-04 ""Report Run Date"" 2018-05-05"
From the above I want to print 2018-05-04. I am able to do this with the substring method, but that's not the right way and ...
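A small sketch of a regex-based alternative to substring, assuming the wanted value is simply the first yyyy-MM-dd token in the text. The names `ReportDateSketch` and `firstDate` are illustrative.

```scala
object ReportDateSketch {
  // Matches a yyyy-MM-dd shaped token; it does not validate month or day ranges.
  private val IsoDate = """\d{4}-\d{2}-\d{2}""".r

  // Returns the first date-like token in the text, if there is one.
  def firstDate(text: String): Option[String] = IsoDate.findFirstIn(text)
}
```

Returning an Option makes the "no date present" case explicit instead of failing on an index calculation.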
0
votes
1answer
18 views
Accessing the remote class method to resolve cyclic dependency
I am working with a multi-module Maven project, where I am facing a cyclic dependency between the driver (module A) and the actors (module B).
The actors module has a common creation of the actor system which ...
0
votes
2answers
16 views
Is it possible to include directory information in spark.read.csv?
Scenario:
I wrote CSV data with something like
df.write.partitionBy("foo", "bar").csv("hdfs:///quux/bletch")
The CSV files in the hdfs://quux/bletch/foo=baz/bar=moo directories all lack the foo and ...
0
votes
1answer
59 views
What is the significance of 'E' in regex_replace?
In our project, we move the data from tables on RDBMS to HDFS using Scala and Spark. Before moving the data, we apply a "regex_replace" on the data to eliminate some discrepancies in the data. Below ...